AFIT/DS/ENC/99-01 


Estimation  and  Goodness-of-Fit  in  the  Case  of 
Randomly  Censored  Lifetime  Data 

DISSERTATION 
David  M.  Reineke 


Approved  for  public  release;  distribution  unlimited 


The  views  expressed  in  this  dissertation  are  those  of  the  author  and  do  not  reflect  the  official  policy 
or  position  of  the  Department  of  Defense  or  the  United  States  Government. 




Estimation  and  Goodness-of-Fit  in  the  Case  of  Randomly  Censored  Lifetime  Data 


DISSERTATION 


Presented  to  the  Faculty  of  the  Graduate  School  of  Engineering 
of  the  Air  Force  Institute  of  Technology 
Air  University 
In  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of 
Doctor  of  Philosophy 


David  M.  Reineke,  B.S.,  M.S. 


June,  1999 






Estimation  and  Goodness-of-Fit  in  the  Case  of  Randomly  Censored  Lifetime  Data 

David  M.  Reineke,  B.S.,  M.S. 

Approved:


Dean’s  Representative 


Robert  A.  Calico,  Jr 
Dean 


Acknowledgements 


I would like to take this opportunity to thank my dissertation advisor, Major John S. Crown, for
his patient guidance and good humor, Major Edward A. Pohl for his creativity and inspiration,
Dr. Mark E. Oxley for his technical expertise in the area of functional analysis, and especially
Dr. Albert H. Moore for sharing his profound insight and a lifetime of statistical knowledge.
Furthermore,  I  am  indebted  to  Dr.  Moore  for  introducing  me  to  the  “AFIT  family”  and  guiding  me 
into  an  area  of  research  so  rich  with  opportunity.  I  would  like  to  express  my  appreciation  to  Mrs. 
Kristen  Larsen  for  her  diligent  system  administration  in  the  Computational  Dynamics  and  Design 
Lab,  where  much  of  this  work  was  done.  I  am  also  grateful  for  the  excellent  faculty,  computing 
facilities,  and  library  resources  at  AFIT. 

Finally, I am infinitely grateful for the unconditional love, support, patience, and understanding
of my wife and children throughout the preparation of this dissertation.


David  M.  Reineke 




Table  of  Contents 


Page 

Acknowledgements .  iii 

List  of  Figures .  viii 

List  of  Tables .  xi 

Abstract .  xiii 

I.  Background  and  Problem  Statement .  1 

1.1  Background .  1 

1.2  Problem  Statement  .  3 

1.3  The  Competing  Risks  Model  of  Random  Censoring .  6 

1.4  Organization  of  the  Dissertation .  7 

II.  Estimation .  9 

2.1  Parametric  Estimation .  9 

2.1.1  Maximum Likelihood using the Censored-Data Likelihood Function .  9

2.1.2  New  Minimum  Distance  Methods  for  Randomly  Censored  Data  10 

2.1.3  Estimating the Parameters of the 3-Parameter Weibull distribution .  14

2.2  Nonparametric  Estimation .  23 

2.2.1  The  Kaplan-Meier  Product-Limit  Estimator .  23 

2.2.2  The  Mean  Order  Number  Estimator .  25 

2.2.3  The  Piecewise  Exponential  Estimator .  26 

2.2.4  Kernel  Estimators .  28 

2.2.5  New Trigonometrically-Smoothed and Jackknifed Estimators for Randomly Censored Data .  30




2.3  Semi-Parametric  Estimation .  33 

2.3.1  The Klein, Lee, and Moeschberger Partially Parametric Estimator .  33

2.3.2  A  Semi-Parametric  Kaplan-Meier  Estimator .  34 

2.4  A  Comparison  of  Distribution  Function  Estimators .  35 

III.  Goodness-of-Fit  .  43 

3.1  Literature  Review .  43 

3.1.1  Tests  of  Simple  Hypothesis .  43 

3.1.2  Tests  of  Composite  Hypothesis .  44 

3.2  Asymptotic  Distributions  of  KME-Modified  Test  Statistics .  45 

3.2.1  Literature  Review  of  General  Asymptotic  Theory .  45 

3.2.2  Problems with the Asymptotic Distribution of Test Statistics in Tests of Exponentiality with Exponential Censoring .  48

3.2.3  Problems with the Asymptotic Distribution of Test Statistics when Testing for the Weibull Distribution Within the Proportional Hazards Model of Random Censorship .  50

3.2.4  Remarks on Asymptotic Distributions of Goodness-of-Fit Statistics .  51

3.3  Some Justification for the Assumption of an Exponentially Distributed Censoring Variable .  52

3.4  Modified EDF Statistics for New Goodness-of-Fit Tests for Randomly Censored Data .  53

3.4.1  Computing  Formulas .  53 

3.4.2  A  Condition  for  Location  and  Scale  Invariance .  55 

3.5  New Goodness-of-Fit Tests for Exponential Lifetimes with Exponentially Distributed Random Right Censoring for a Composite Hypothesis .  56

3.5.1  New Tests Based on KME-Modified Cramer-von Mises and Anderson-Darling Test Statistics .  56

3.5.2  Burke’s  Test  for  Exponentiality  for  a  Composite  Hypothesis  .  58 




3.6  New Goodness-of-Fit Tests for the Weibull with Exponentially Distributed Random Right-Censoring for a Composite Hypothesis .  59

3.6.1  New Tests Based on Modified Cramer-von Mises and Anderson-Darling Statistics for Unknown Location and Scale .  59

3.7  New  Goodness-of-Fit  Tests  Based  on  Crude  Lifetimes .  60 

3.8  New  Semi-Parametric  Goodness-of-Fit  Tests  Based  on  Crude  Lifetimes  70 

3.9  Power  Studies  .  76 

3.9.1  Exponential  Failure  with  Exponential  Censoring .  76 

3.9.2  Weibull  Failure  with  Exponential  Censoring .  80 

IV.  Summary  and  Conclusions .  87 

Appendix  A.  Numerical  Integration  with  Simpson’s  Rule .  93 

Appendix  B.  An  Interesting  Net  Lifetime  Result  from  Tests  Based  on  Crude  Lifetimes  94 

Appendix  C.  Plots  Illustrating  Estimation  Techniques  for  Randomly  Censored  Data  97 

Appendix  D.  Percentage  Points  for  New  Tests  for  the  Exponential  Distribution  .  .  113 

Appendix  E.  Percentage  Points  for  New  Tests  for  the  Weibull  Distribution .  120 

Appendix F.  Plots and Tables of Power Study Results for Tests of Exponentiality .  125

Appendix G.  Matlab Code for 2-Parameter Weibull MLE .  149

Appendix H.  Matlab Code for Minimum Distance Estimation of a Weibull Location Parameter .  151

Appendix I.  Matlab Code for Distribution Function Estimators and MISE .  156

Appendix J.  Percentage Point Generation for the Exponential with Exponential Censoring .  166

Appendix K.  Percentage Point Generation for the Weibull with Exponential Censoring .  170




Appendix  L.  Power  Study  for  the  Exponential  with  Exponential  censoring .  176 

Appendix  M.  Power  Study  for  the  Weibull  with  Exponential  Censoring .  184 

Bibliography .  190 

Vita .  199 


List  of  Figures 


Figure  Page 

1.  An  Example  of  the  Skewness  of  Integrated  Squared  Error  at  Sample  Size  20.  22 

2.  Illustration of Upper Integration Limit for the KME-Modified Cramer-von Mises Statistic .  54

3.  Net Lifetime when Crude Lifetimes are WEI(2.5, 900) and WEI(1.5, 600) at 50% Expected Censoring .  67

4.  Net Lifetime when Crude Lifetimes are GAM(3, 500) and LOGN(1.5, 600) at 50% Expected Censoring .  68

5.  Net Lifetime for the Leukemia Remission Times when Crude Lifetimes are WEI(2.03, 13.77) and WEI(2.22, 23.60) at 57% Censoring .  69

6.  Net Lifetime CDF for Leukemia Remission Times from Crude Lifetimes WEI(2.03, 13.77) and WEI(2.22, 23.60) at 57% Censoring .  73

7.  Empirical Survivor Function of Crude Life Censoring Times for Semi-Parametric Crude Life Test .  74

8.  CDF Comparison for Semi-Parametric Crude Life Test with Crude Life WEI(0.89, 34.14).  75

9.  PDF’s of Alternative Distributions Used in the Power Study of Tests for Exponentiality .  79

10.  PDF’s of Alternative Distributions Used in the Power Study of Tests for the Weibull (shape $\beta = 2$) .  81

11.  PDF’s of Alternative Distributions Used in the Power Study of Tests for the Weibull (shape $\beta = 3.5$) .  83

12.  Net Lifetime when the Crude Censoring Lifetime is WEI(2.5, 25) and the Crude Lifetime of Interest is WEI(2, 15) at 50% Expected Censoring .  96

13.  Maximum  Likelihood  Estimator,  Exponential  Distribution .  98 

14.  Maximum  Likelihood  Estimator,  Weibull  with  shape  2 .  98 

15.  Maximum  Likelihood  Estimator,  Weibull  with  shape  3.5 .  99 

16.  Kaplan-Meier  Estimator,  Exponential  Distribution .  99 

17.  Kaplan-Meier  Estimator,  Weibull  with  shape  2 .  100 

18.  Kaplan-Meier  Estimator,  Weibull  with  shape  3.5 .  100 


19.  Mean  Order  Number  Estimator,  Exponential  Distribution .  101 

20.  Mean  Order  Number  Estimator,  Weibull  with  shape  2 .  101 

21.  Mean  Order  Number  Estimator,  Weibull  with  shape  3.5 .  102 

22.  Piecewise  Exponential  Estimator,  Exponential  Distribution .  102 

23.  Piecewise  Exponential  Estimator,  Weibull  with  shape  2 .  103 

24.  Piecewise  Exponential  Estimator,  Weibull  with  shape  3.5 .  103 

25.  Blum-Susarla  Kernel  Estimator,  Exponential  Distribution .  104 

26.  Blum-Susarla  Kernel  Estimator,  Weibull  with  shape  2 .  104 

27.  Blum-Susarla  Kernel  Estimator,  Weibull  with  shape  3.5 .  105 

28.  Foldes,  Rejto,  and  Winter  Kernel  Estimator,  Exponential  Distribution.  .  .  .  105 

29.  Foldes,  Rejto,  and  Winter  Kernel  Estimator,  Weibull  with  shape  2 .  106 

30.  Foldes,  Rejto,  and  Winter  Kernel  Estimator,  Weibull  with  shape  3.5 .  106 

31.  Trigonometrically-Smoothed  KME,  Exponential  Distribution .  107 

32.  Trigonometrically-Smoothed  KME,  Weibull  with  shape  2 .  107 

33.  Trigonometrically-Smoothed  KME,  Weibull  with  shape  3.5 .  108 

34.  Trigonometrically-Smoothed  and  Jackknifed  KME,  Exponential  Distribution.  108 

35.  Trigonometrically-Smoothed  and  Jackknifed  KME,  Weibull  with  shape  2.  .  .  109 

36.  Trigonometrically-Smoothed  and  Jackknifed  KME,  Weibull  with  shape  3.5.  .  109 

37.  Klein,  Lee,  and  Moeschberger  Estimator,  Exponential  Distribution .  110 

38.  Klein,  Lee,  and  Moeschberger  Estimator,  Weibull  with  shape  2 .  110 

39.  Klein, Lee, and Moeschberger Estimator, Weibull with shape 3.5 .  111

40.  Semi-Parametric Kaplan-Meier Estimator, Exponential Distribution .  111

41.  Semi-Parametric  Kaplan-Meier  Estimator,  Weibull  with  shape  2 .  112 

42.  Semi-Parametric  Kaplan-Meier  Estimator,  Weibull  with  shape  3.5 .  112 

43.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Exponential, $\alpha = 0.10$ .  134

44.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Exponential, $\alpha = 0.05$ .  135



45.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Exponential, $\alpha = 0.025$ .  136

46.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Weibull (shape=2), $\alpha = 0.10$ .  137

47.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Weibull (shape=2), $\alpha = 0.05$ .  138

48.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Weibull (shape=2), $\alpha = 0.025$ .  139

49.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=1.5), $\alpha = 0.10$ .  140

50.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=1.5), $\alpha = 0.05$ .  141

51.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=1.5), $\alpha = 0.025$ .  142

52.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=2), $\alpha = 0.10$ .  143

53.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=2), $\alpha = 0.05$ .  144

54.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=2), $\alpha = 0.025$ .  145

55.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Lognormal from N(0,1), $\alpha = 0.10$ .  146

56.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Lognormal from N(0,1), $\alpha = 0.05$ .  147

57.  Power Comparison of Tests for Exponentiality, Underlying Distribution is Lognormal from N(0,1), $\alpha = 0.025$ .  148




List  of  Tables 


Table  Page

1.  Parameter Values of the 3-parameter Weibull for MISE Comparison .  18

2.  Key to Abbreviations of MD/ML Estimators for the 3-Parameter Weibull Distribution .  19

3.  3-Parameter Weibull, shape $\beta = 2$, with Exponential Censoring .  20

4.  3-Parameter Weibull, shape $\beta = 3.5$, with Exponential Censoring .  21

5.  P-values of the Kruskal-Wallis Test for ISE Comparison of MD/ML Estimators.  23

6.  Parameter Values for MISE Comparison .  36

7.  Key to Abbreviations of Estimators .  36

8.  MISE: Exponential with Exponential Censoring .  37

9.  MISE: 2-Parameter Weibull, shape 2, with Exponential Censoring .  38

10.  MISE: 2-Parameter Weibull, shape 3.5, with Exponential Censoring .  39

11.  Previous Goodness-of-Fit Tests for Randomly Censored Data with Composite $H_0$ .  45

12.  Percentage Points of for the Exponential Distribution with Exponential Censoring .  59

13.  Summary of Test Results for the Prostate Cancer Data .  72

14.  Empirical Power of Modified $W^2$ and $A^2$ Statistics at $\alpha = 0.10$ .  82

15.  Empirical Power of Modified $W^2$ and $A^2$ Statistics at $\alpha = 0.05$ .  84

16.  Empirical Power of Modified $W^2$ and $A^2$ Statistics at $\alpha = 0.025$ .  84

17.  Empirical Power of Modified $W^2$ and $A^2$ Statistics at $\alpha = 0.10$ .  85

18.  Empirical Power of Modified $W^2$ and $A^2$ Statistics at $\alpha = 0.05$ .  85

19.  Empirical Power of Modified $W^2$ and $A^2$ Statistics at $\alpha = 0.025$ .  86

20.  Percentage Points of for the Exponential Distribution with Exponential Censoring .  114

21.  Percentage Points of for the Exponential Distribution with Exponential Censoring (cont’d) .  115



22.  Percentage Points of for the Exponential Distribution with Exponential Censoring (cont’d) .  116

23.  Percentage Points of $A^2$ for the Exponential Distribution with Exponential Censoring .  117

24.  Percentage Points of for the Exponential Distribution with Exponential Censoring (cont’d) .  118

25.  Percentage Points of for the Exponential Distribution with Exponential Censoring (cont’d) .  119

26.  Percentage Points of for the Weibull Distribution with Shape $\beta = 2$ and Exponential Censoring .  121

27.  Percentage Points of for the Weibull Distribution with Shape $\beta = 3.5$ and Exponential Censoring .  122

28.  Percentage Points of $A^2$ for the Weibull Distribution with Shape $\beta = 2$ and Exponential Censoring .  123

29.  Percentage Points of $A^2$ for the Weibull Distribution with Shape $\beta = 3.5$ and Exponential Censoring .  124

30.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.10$ and $q = 0.20$ .  125

31.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.10$ and $q = 0.50$ .  126

32.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.10$ and $q = 0.80$ .  127

33.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.05$ and $q = 0.20$ .  128

34.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.05$ and $q = 0.50$ .  129

35.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.05$ and $q = 0.80$ .  130

36.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.025$ and $q = 0.20$ .  131

37.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.025$ and $q = 0.50$ .  132

38.  Empirical Power of Goodness-of-Fit Statistics at $\alpha = 0.025$ and $q = 0.80$ .  133





Abstract 

Several parametric, nonparametric, and semi-parametric estimators of the distribution function
of a randomly right-censored random variable are compared. A new continuous distribution
function  estimator  for  randomly  censored  data  is  developed,  discussed,  and  compared  to  existing 
estimators.  Minimum  distance  estimation  is  shown  to  be  effective  in  estimating  Weibull  location 
parameters  when  random  censoring  is  present.  A  method  of  estimating  all  3  parameters  of  the 
3-parameter Weibull distribution using a combination of minimum distance and maximum likelihood
is also given. The mean integrated squared error is estimated for each estimator using Monte
Carlo simulation, and Kruskal-Wallis tests are used to discern which estimators are the best in
the  sense  of  having  the  smallest  integrated  squared  error.  A  number  of  new  goodness-of-fit  tests 
for  randomly  censored  data  with  a  composite  hypothesis  are  introduced.  Cramer-von  Mises  and 
Anderson-Darling  goodness-of-fit  test  statistics  are  modified  to  measure  the  discrepancy  between 
the  maximum  likelihood  estimate  and  the  Kaplan-Meier  product  limit  estimate  of  the  distribution 
function  of  the  random  variable  of  interest.  These  modified  test  statistics  are  used  to  construct 
goodness-of-fit  tests  for  the  exponential,  Weibull  (shape  2),  and  Weibull  (shape  3.5)  distributions 
when  the  censoring  distribution  is  assumed  to  be  exponential.  Percentage  points  are  obtained  via 
Monte  Carlo  simulation.  Another  test  for  the  exponential  with  exponential  censoring  is  constructed 
based on the knowledge that the minimum of exponentials is also exponential. More generally, elements
of competing risks theory are used to build goodness-of-fit tests using crude lifetimes. One
type  of  test  requires  a  parametric  fit  to  the  crude  lifetimes  of  both  the  variable  of  interest  and  the 
censoring  random  variable  while  the  other  type  relies  on  the  empirical  survivor  function  of  the  crude 
lifetime  of  the  censoring  variable.  In  either  case  the  assumption  of  an  exponentially  distributed 
censoring  variable  and  special  estimation  techniques  are  no  longer  required,  bringing  much  more 
flexibility to goodness-of-fit testing when samples are randomly right-censored. The powers of the
new KME-modified Cramer-von Mises and Anderson-Darling goodness-of-fit tests as well as the
new  tests  based  on  crude  lifetimes  are  compared  to  each  other  and  to  existing  tests  by  Burke  and 
Chen  in  the  case  of  the  test  for  exponentiality.  Examples  of  crude  life  tests  are  presented  and 
discussed. 




Estimation  and  Goodness-of-Fit  in  the  Case  of  Randomly  Censored  Lifetime  Data 


I.  Background  and  Problem  Statement 

1.1  Background 

The  time  until  the  occurrence  of  an  event  is  randomly  censored  on  the  right  when  the  observed 
portion  of  the  time  is  arbitrarily  truncated  before  the  event,  or  failure,  can  be  observed.  This 
situation  can  be  modeled  using  two  random  variables:  the  variable  of  interest  and  a  censoring 
variable.  The  random  variable  of  interest  is  randomly  censored  on  the  right  by  the  censoring 
variable  when  only  the  minimum  of  the  two  can  be  observed.  This  is  a  competing  risks  model  with 
two  risks.  The  observer  is  usually  able  to  distinguish  failure  times  from  censoring  times.  We  will 
consider  only  the  scenario  where  failure  times  are  distinguishable  from  censoring  times  and  restrict 
our  attention  to  continuous  univariate  probability  distributions  throughout  the  dissertation. 

The  objective  is  to  fit  a  random  sample  of  the  failure  times  with  a  parametric  model.  Random 
censorship  makes  this  task  more  difficult  because  the  actual  failure  time  is  not  known  when  an  item’s 
lifetime  is  censored.  What  is  known  is  that  the  censored  observation  survived  at  least  until  the  time 
of  censoring.  To  disregard  the  censored  items  and  use  only  the  observed  failures  would  not  provide 
an  accurate  representation  of  the  true  underlying  process.  Furthermore,  the  information  provided 
by  the  knowledge  that  the  suspended  or  censored  item  survived  for  a  certain  time  period  would  be 
lost. 
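The cost of discarding censored items can be illustrated with a small simulation. The sketch below is not taken from the dissertation: it assumes exponential failure and censoring times with illustrative rates, and the function names are hypothetical. It compares the mean of the observed failures alone with the standard censored-data maximum likelihood estimate of an exponential mean (total observed time divided by the number of failures).

```python
import random

def simulate_censored_sample(n, fail_rate, cens_rate, rng):
    """Return (xs, deltas): xs[i] = min(t_i, c_i), deltas[i] = 1 if the failure was observed."""
    xs, deltas = [], []
    for _ in range(n):
        t = rng.expovariate(fail_rate)   # failure time (variable of interest)
        c = rng.expovariate(cens_rate)   # censoring time
        xs.append(min(t, c))
        deltas.append(1 if t <= c else 0)
    return xs, deltas

def naive_mean(xs, deltas):
    """Mean lifetime estimated from observed failures only; censored items are discarded."""
    failures = [x for x, d in zip(xs, deltas) if d == 1]
    return sum(failures) / len(failures)

def censored_mle_mean(xs, deltas):
    """Exponential MLE of the mean under right censoring: total time on test / failures."""
    return sum(xs) / sum(deltas)

# Illustrative assumption: unit failure rate and unit censoring rate (about 50% censoring).
rng = random.Random(1999)
xs, deltas = simulate_censored_sample(20000, fail_rate=1.0, cens_rate=1.0, rng=rng)
naive = naive_mean(xs, deltas)
mle = censored_mle_mean(xs, deltas)
print(f"true mean 1.0, failures-only mean {naive:.3f}, censored-data MLE {mle:.3f}")
```

Under these assumed rates the failures-only average is pulled well below the true mean, because long lifetimes are the ones most likely to be censored, while the estimate that uses the censored survival times is not.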

Some  of  the  goodness-of-fit  procedures  for  randomly  censored  data  when  the  null  distribution 
is  completely  specified  require  the  assumption  that  the  censoring  distribution  has  a  hazard  rate 
that  is  proportional  to  that  of  the  failure  distribution.  This  proportional  hazards  model  of  random 
censorship  has  been  explored  by  Koziol  and  Green  [81],  Koziol  [80],  and  Csorgo  and  Horvath  [31] 
among others. Many authors refer to this as the Koziol-Green model of random censorship. Distribution
theory for many goodness-of-fit statistics is very difficult, possibly intractable, particularly
when  parameters  are  estimated  from  data  and  especially  when  data  is  randomly  censored.  As  a 
result,  tables  of  percentage  points  are  not  available  for  testing  goodness  of  fit.  The  development  of 
a  goodness-of-fit  test  can  be  approached  from  one  of  two  directions:  either  (1)  find  an  appropriate 
test  statistic,  preferably  one  with  high  power  against  a  wide  variety  of  alternatives,  and  attempt  to 
derive  the  distribution  of  percentage  points,  or  (2)  construct  a  test  statistic  in  such  a  way  that  it 
follows  a  known  distribution  for  which  percentage  points  are  readily  available. 

The  purpose  of  performing  a  goodness-of-fit  test  is  to  fit  a  random  sample  that  is  assumed 
to  be  independent  and  identically  distributed  with  a  parametric  model  in  order  to  facilitate  the 
characterization of the random variable as well as any desired statistical inference, reliability analysis,
or maintenance planning. Quadratic distance measures such as the Cramer-von Mises and
Anderson-Darling  statistics  provide  a  reasonable  measure  of  the  discrepancy  between  an  empirical 
distribution  function  (EDF)  and  the  distribution  function  of  some  parametric  family  with  either 
known  or  estimated  parameters  and  generally  have  superior  power  [123]  among  goodness-of-fit 
statistics for testing with uncensored samples. See D’Agostino and Stephens [32] for detailed descriptions
of these statistics and more. In the case of a simple hypothesis where parameters are
assumed  known  in  the  null  hypothesis,  the  test  statistic  will  measure  the  discrepancy  between  the 
EDF  and  a  hypothetically  true  distribution  function.  However,  in  a  goodness-of-fit  test  with  a 
composite  hypothesis  in  which  parameters  are  estimated  from  a  random  sample,  the  only  thing 
that  is  assumed  to  be  true  in  the  null  hypothesis  is  the  distributional  family  that  characterizes 
the  random  variable  of  interest.  EDF  test  statistics  measure  the  discrepancy  between  the  EDF,  a 
nonparametric  estimate,  and  a  parametrically  estimated  distribution  function  usually  obtained  by 
maximum  likelihood.  This  discrepancy  is  likely  to  be  smaller  when  parameters  are  estimated  from 
the  data  because  both  estimates  are  dependent  on  the  same  random  sample.  Although  parameters 
are estimated from the data, the purpose of the goodness-of-fit test is still likely to be satisfied because
the test verifies (or rules out) a parametric family of distributions that will (not) adequately
characterize  the  random  variable  of  interest. 

1.2  Problem  Statement 

Data that is randomly censored complicates the estimation procedures and alters the distribution,
and hence the percentage points, of goodness-of-fit test statistics. Furthermore, the EDF is no
longer  available  as  an  estimator  of  the  distribution  function  under  random  censorship.  Researchers 
throughout  the  years  have  developed  both  parametric  and  nonparametric  estimation  techniques 
that will accommodate density and distribution function estimation for randomly censored samples.
These estimators can be used to build goodness-of-fit tests. The effectiveness of an estimator
is  often  measured  in  terms  of  mean  integrated  squared  error  (MISE).  The  smaller  the  MISE,  the 
better  the  estimator.  MISE  is  dependent  on: 

1.  sample  size; 

2.  the  proportion  of  censoring; 

3.  the  underlying  lifetime  distribution; 

4.  the  method  of  estimation; 

5.  and  possibly  the  distribution  of  the  censoring  random  variable. 

We  would  like  to  find  an  estimator  with  the  smallest  MISE  for  a  wide  variety  of  lifetime  distributions. 
The problem may be formulated as follows. Let $M$ denote our measure of MISE, let $\mathcal{S}$ be a class of
distribution function estimators that can be used with randomly censored data, and let $\mathcal{F}$ represent
a set of distribution function families. Given $F \in \mathcal{F}$, find $\hat{F} \in \mathcal{S}$ that minimizes $M(\hat{F}, F)$.
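A Monte Carlo estimate of MISE for one estimator can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's Matlab code from the appendices: the estimator is the Kaplan-Meier product-limit estimator, lifetimes and censoring times are taken to be exponential with illustrative rates, the squared error is integrated with respect to t from 0 to the largest observation by a midpoint rule, and all function names are hypothetical.

```python
import bisect
import math
import random

def kaplan_meier_cdf(xs, deltas):
    """Return the failure times and the Kaplan-Meier estimate of F just after each one."""
    # Failures are processed before censored items at tied times.
    order = sorted(zip(xs, deltas), key=lambda pair: (pair[0], -pair[1]))
    n = len(order)
    surv = 1.0
    times, cdf = [], []
    for i, (x, d) in enumerate(order):
        at_risk = n - i
        if d == 1:
            surv *= (at_risk - 1) / at_risk
            times.append(x)
            cdf.append(1.0 - surv)
    return times, cdf

def step_value(times, cdf, t):
    """Evaluate the right-continuous step function defined by (times, cdf) at t."""
    idx = bisect.bisect_right(times, t)
    return cdf[idx - 1] if idx > 0 else 0.0

def integrated_squared_error(times, cdf, true_cdf, upper, steps=500):
    """Midpoint-rule approximation of the integral of (F_hat - F)^2 over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        diff = step_value(times, cdf, t) - true_cdf(t)
        total += diff * diff * h
    return total

def estimate_mise(n, fail_rate, cens_rate, reps, rng):
    """Average ISE of the Kaplan-Meier estimator over simulated censored samples."""
    true_cdf = lambda t: 1.0 - math.exp(-fail_rate * t)
    total = 0.0
    for _ in range(reps):
        ts = [rng.expovariate(fail_rate) for _ in range(n)]
        cs = [rng.expovariate(cens_rate) for _ in range(n)]
        xs = [min(t, c) for t, c in zip(ts, cs)]
        ds = [1 if t <= c else 0 for t, c in zip(ts, cs)]
        times, cdf = kaplan_meier_cdf(xs, ds)
        total += integrated_squared_error(times, cdf, true_cdf, upper=max(xs))
    return total / reps

rng = random.Random(20)
mise_small = estimate_mise(n=20, fail_rate=1.0, cens_rate=0.25, reps=100, rng=rng)
mise_large = estimate_mise(n=200, fail_rate=1.0, cens_rate=0.25, reps=100, rng=rng)
print(f"estimated MISE at n=20: {mise_small:.4f}, at n=200: {mise_large:.4f}")
```

Repeating the same loop for each candidate estimator, with the same simulated samples, gives the kind of MISE comparison the problem statement describes; the choice of upper integration limit is a design decision that affects the numbers.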

Another method for comparing the relative sizes of values from several populations is the
Kruskal-Wallis test. The Kruskal-Wallis test is a nonparametric test and, as such, requires no
distributional assumptions on the underlying populations being compared. Thus, when comparing skewed
populations such as integrated squared error (ISE), the Kruskal-Wallis test should augment the results
seen in the MISE comparison by shedding light on whether the ISE values, in general, for one
estimator are significantly lower than those for another. Therefore, the Kruskal-Wallis test will play a role
in determining which estimators are better than others for the scenarios under consideration.
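The Kruskal-Wallis comparison can be sketched in a few lines. The example below is a minimal illustration, not the dissertation's procedure: it computes the H statistic from mid-ranks without a tie correction, uses the large-sample chi-square approximation (which for three groups has 2 degrees of freedom and an upper tail of exp(-H/2)), and the ISE values are made-up placeholders rather than results from the study.

```python
import math

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using mid-ranks; no tie correction is applied."""
    pooled = sorted((value, g) for g, group in enumerate(groups) for value in group)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2.0 + 1.0  # average of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += mid_rank
        i = j + 1
    weighted = sum(rs * rs / len(group) for rs, group in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

def p_value_three_groups(h):
    """Chi-square upper tail with 2 degrees of freedom, which equals exp(-h/2)."""
    return math.exp(-h / 2.0)

# Made-up ISE values for three hypothetical estimators, for illustration only.
ise_by_estimator = [
    [0.011, 0.024, 0.037],
    [0.042, 0.055, 0.068],
    [0.071, 0.083, 0.096],
]
h_stat = kruskal_wallis_h(ise_by_estimator)
p_val = p_value_three_groups(h_stat)
print(f"H = {h_stat:.3f}, approximate p-value = {p_val:.4f}")
```

With samples this small the chi-square approximation is rough; in a simulation study each group would hold one ISE value per Monte Carlo replication.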

One  particular  problem  in  goodness-of-fit  testing  lies  in  the  precise  characterization  of  the 
distribution  of  the  goodness-of-fit  statistics.  The  distribution  of  a  goodness-of-fit  statistic  must 
be  known  or  estimated  in  order  to  know  when  to  reject  a  given  null  hypothesis  at  a  given  level  of 
significance  when  the  value  of  a  test  statistic  is  calculated  from  observed  data.  While  not  necessary, 
it  is  very  desirable  for  a  goodness-of-fit  test  statistic  to  be  invariant  to  location  and  scale  changes 
in  the  distribution  being  tested.  The  distribution  of  the  test  statistic  will  be  dependent  on  the 
censoring  process  as  well  as  the  amount  of  censoring  in  addition  to  the  distributional  family  being 
tested  and  any  shape  parameters  that  may  be  involved. 

The primary concern here within the context of the goodness-of-fit problem is with the composite
hypothesis, where parameters must be estimated in order to test fit. With randomly censored
lifetime data, the power of a goodness-of-fit test for a composite null hypothesis at a given significance
level for a specified alternative distribution is dependent on:

1.  sample  size; 

2.  the  proportion  of  censoring; 

3.  the  family  selected  for  the  failure  distribution; 

4.  the  true  underlying  alternative  distribution; 

5.  the  censoring  distribution; 

6.  the  method  of  parameter  estimation; 

7.  the  method  of  nonparametric  distribution  function  estimation; 

8.  the  test  statistic. 




Items 2, 5, and 7 arise in the presence of random censoring. Item 7 vanishes in the case of goodness-of-fit
tests using crude lifetimes, which are given in Sections 3.7 and 3.8. Power is defined as the
probability of correctly rejecting the null hypothesis when, in reality, it is false. Our objective is to
construct a test statistic that has the highest power among all test statistics for the widest
variety of alternatives under a given set of circumstances. The goodness-of-fit problem may be
stated as follows. Let $\mathcal{T}$ be a set of goodness-of-fit test statistics that can be used with randomly
censored data, $\mathcal{F}$ be a set of hypothesized distribution function families, and $\mathcal{G}$ be a set of alternative
distribution function families. Further, let $\mathcal{P}$ denote the power of a given test statistic $T \in \mathcal{T}$ which
is used to detect an alternative distribution $G \in \mathcal{G}$ when the hypothesized distribution is $F \in \mathcal{F}$.
Given $F \in \mathcal{F}$ and $G \in \mathcal{G}$, find $T \in \mathcal{T}$ that maximizes $\mathcal{P}(T, F, G)$.
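Empirical power for a single choice of statistic, hypothesized family, and alternative can be estimated by simulation, as in the sketch below. It is a hedged illustration only: the statistic is a simple Kolmogorov-Smirnov-type distance between the Kaplan-Meier estimate and an exponential fit by censored-data maximum likelihood, chosen for brevity and not one of the statistics developed in the dissertation. Sample size, censoring rate, number of replications, and all function names are assumptions.

```python
import math
import random

def ks_type_statistic(xs, deltas):
    """Largest gap between the Kaplan-Meier CDF and an exponential CDF fitted by MLE."""
    r = sum(deltas)
    if r == 0:
        return 0.0
    theta = sum(xs) / r  # censored-data MLE of the exponential mean
    order = sorted(zip(xs, deltas), key=lambda pair: (pair[0], -pair[1]))
    n = len(order)
    surv, worst = 1.0, 0.0
    for i, (x, d) in enumerate(order):
        if d == 1:
            fitted = 1.0 - math.exp(-x / theta)
            before = 1.0 - surv                 # KM value just before the jump
            surv *= (n - i - 1) / (n - i)
            after = 1.0 - surv                  # KM value just after the jump
            worst = max(worst, abs(fitted - before), abs(fitted - after))
    return worst

def censored_sample(n, draw_lifetime, cens_rate, rng):
    """Draw n lifetimes, censor them with exponential censoring times, return (xs, deltas)."""
    xs, deltas = [], []
    for _ in range(n):
        t, c = draw_lifetime(rng), rng.expovariate(cens_rate)
        xs.append(min(t, c))
        deltas.append(1 if t <= c else 0)
    return xs, deltas

def simulate_statistics(n, draw_lifetime, cens_rate, reps, rng):
    """Simulated values of the statistic under the given lifetime distribution."""
    return [ks_type_statistic(*censored_sample(n, draw_lifetime, cens_rate, rng))
            for _ in range(reps)]

def upper_percentage_point(values, alpha):
    """Empirical (1 - alpha) quantile of the simulated null statistics."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, math.ceil((1.0 - alpha) * len(ordered)) - 1))
    return ordered[index]

def rejection_rate(values, critical_value):
    """Fraction of simulated statistics that exceed the critical value."""
    return sum(1 for v in values if v > critical_value) / len(values)

rng = random.Random(42)
exponential = lambda g: g.expovariate(1.0)             # hypothesized family
weibull_shape2 = lambda g: g.weibullvariate(1.0, 2.0)  # alternative: scale 1, shape 2

null_stats = simulate_statistics(50, exponential, 0.25, 400, rng)
crit = upper_percentage_point(null_stats, alpha=0.10)
size = rejection_rate(simulate_statistics(50, exponential, 0.25, 400, rng), crit)
power = rejection_rate(simulate_statistics(50, weibull_shape2, 0.25, 400, rng), crit)
print(f"critical value {crit:.3f}, empirical size {size:.3f}, empirical power {power:.3f}")
```

Because the fitted scale is estimated from each sample, the statistic depends on the data only through the ratios of observations to the estimated mean, so the simulated percentage point does not depend on the unknown scale as long as the censoring scale is held in fixed proportion to it.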

In addition to deriving a goodness-of-fit test with maximum power in a variety of settings,
it is also important to consider flexibility. Consequently, we seek to maximize the flexibility of a
goodness-of-fit test by avoiding restrictive assumptions regarding the distribution of the censoring
variable. It is in this aspect that tests based on the fit of crude lifetimes stand apart from other
tests in the case of randomly censored data. Until now the problem of testing goodness-of-fit with
randomly censored data has always been approached from the perspective of competing net lifetimes and
has not been addressed as a mixture of crude lifetimes. The concepts of net and crude lifetimes are
defined and discussed in Sections 3.7 and 3.8.

The problem of random right-censoring is of interest to the Air Force. One example of the
need to overcome this problem is the characterization of the lifetime distribution of the Advanced
Medium Range Air-to-Air Missile (AMRAAM) and its components. The AMRAAM is an active
radar-guided missile carried by USAF F-15 and F-16 fighters. As the missiles accumulate captive
carry hours, the rigors of take-off, flight, and landing, as well as other environmental stresses, can lead
to a state of missile failure in which the missile is no longer capable of being launched. However,
if a missile is successfully launched or permanently removed from the field for any reason, then it is
known to have survived until that time and is randomly censored because it is no longer observable
until failure.


1.3  The Competing Risks Model of Random Censoring

The most widely used model in the literature to represent random right-censoring is that of
two independent competing risks. The following notation will be used throughout this work. Let
$T$ be the random variable of interest with failure distribution function $F_T(t)$, survivor function
$S_T(t) = 1 - F_T(t)$, and assume the density function $f_T(t) = F_T'(t)$ exists. Further, let $C$ be a
censoring random variable with distribution function $F_C(c)$, survivor function $S_C(c) = 1 - F_C(c)$,
and assume the density function $f_C(c) = F_C'(c)$ exists. The expected proportion of failures is given
by

$$p = P[T < C] = \int_{0}^{\infty} f_T(t)\, S_C(t)\, dt.$$


The observed random sample consists of $x_i = \min\{t_i, c_i\}$, $i = 1, \ldots, n$, and the indicator
$\delta_i = I_{[t_i < c_i]}$. Throughout this dissertation, the notation $t_{(i)}$, $i = 1, \ldots, r$, will be used to
denote the set of ordered failure times, $c_{(i)}$, $i = 1, \ldots, n - r$, will denote the set of ordered
withdrawal times, and $x_{(i)}$, $i = 1, \ldots, n$, will denote the entire set of ordered observations, failed
or suspended. Additionally, $r$ will be used to represent the sum $\sum_{i=1}^{n} \delta_i$, the number of observed
failures in the sample.
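As a rough illustration of this notation, the following sketch draws a randomly censored sample $(x_i, \delta_i)$ and approximates the expected proportion of failures $p = P[T < C]$ by numerical integration. The exponential choices for both $T$ and $C$, the parameter values, the function names, and the midpoint rule are assumptions made for the example only; they are not taken from the dissertation.

```python
import math
import random


def expected_failure_proportion(f_T, S_C, upper, steps=100_000):
    """Approximate p = integral of f_T(t) * S_C(t) dt over [0, upper] (midpoint rule)."""
    h = upper / steps
    return h * sum(f_T((k + 0.5) * h) * S_C((k + 0.5) * h) for k in range(steps))


def censored_sample(draw_T, draw_C, n, rng):
    """Return the observed pairs (x_i, delta_i) with x_i = min(t_i, c_i)."""
    pairs = []
    for _ in range(n):
        t, c = draw_T(rng), draw_C(rng)
        pairs.append((min(t, c), 1 if t < c else 0))
    return pairs


# Hypothetical example: T ~ exponential(mean 50), C ~ exponential(mean 150).
mean_T, mean_C = 50.0, 150.0
p = expected_failure_proportion(
    lambda t: math.exp(-t / mean_T) / mean_T,   # density of T
    lambda t: math.exp(-t / mean_C),            # survivor function of C
    upper=2000.0,                               # far enough into the tail for this example
)

rng = random.Random(1)
sample = censored_sample(
    lambda g: g.expovariate(1.0 / mean_T),
    lambda g: g.expovariate(1.0 / mean_C),
    n=2000,
    rng=rng,
)
r = sum(delta for _, delta in sample)           # number of observed failures
```

For two exponential risks the integral has the closed form `mean_C / (mean_T + mean_C)`, which gives a convenient check on the numerical value, and `r / n` should fall near `p` for a large sample.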

In competing risks theory, a net lifetime represents the lifetime of an item when it is subject
to one of the specified risks while no other risks are present, and a crude lifetime represents the
lifetime of an item subject to one of the specified risks when all risks are present [87: p. 110]. Our
objective is to characterize the distribution of the net lifetime for the risk of interest, which is our
underlying failure process. As competing risks, $T$ and $C$ in this context will be used to represent the
net lifetimes, and $Y_T$ and $Y_C$ will denote the crude lifetimes that correspond to the net lifetimes $T$
and $C$. In other words, $T$ represents the lifetime of the random variable of interest when no censoring
is present while $Y_T$ represents the lifetime of interest in the presence of random censoring. When
the net lifetimes are independent, the relationship between the survivor functions of $X$, $T$, $C$, $Y_T$,
and $Y_C$ is

$$S_X(t) = S_T(t)\, S_C(t) = p\, S_{Y_T}(t) + (1 - p)\, S_{Y_C}(t).$$

A proof of this result is given in Section 3.7.
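The relationship above can be checked informally by simulation. The sketch below assumes two independent exponential risks with hypothetical rates, splits the observed values into the crude lifetimes of failures and of censored items, and compares the empirical survivor function of $X$ with both the product $S_T(t) S_C(t)$ and the mixture form. It is an illustration only and not the proof given in Section 3.7.

```python
import math
import random

rng = random.Random(7)
rate_T, rate_C, n = 1.0 / 50.0, 1.0 / 150.0, 5000   # hypothetical values

obs = []
for _ in range(n):
    t, c = rng.expovariate(rate_T), rng.expovariate(rate_C)
    obs.append((min(t, c), t < c))


def surv(values, t):
    """Empirical survivor function: fraction of values exceeding t."""
    return sum(v > t for v in values) / len(values)


x_all = [x for x, _ in obs]
y_T = [x for x, failed in obs if failed]         # crude lifetimes ending in failure
y_C = [x for x, failed in obs if not failed]     # crude lifetimes ending in censoring
p_hat = len(y_T) / n                             # observed proportion of failures

t0 = 40.0
lhs = surv(x_all, t0)                                         # estimate of S_X(t0)
mixture = p_hat * surv(y_T, t0) + (1 - p_hat) * surv(y_C, t0)
product = math.exp(-rate_T * t0) * math.exp(-rate_C * t0)     # S_T(t0) * S_C(t0)
```

The mixture form matches the empirical $S_X$ exactly within the sample, because it is simply a regrouping of the same observations, whereas the agreement with the product form holds only up to Monte Carlo error.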

1.4  Organization of the Dissertation

The  effectiveness  of  minimum  distance  estimation  of  Weibull  location  parameters  for  randomly 
censored  samples  is  demonstrated  using  the  Cramer-von  Mises  and  Anderson-Darling  statistics  in 
Section 2.1.2. A technique for estimating all three parameters of a 3-parameter Weibull distribution
which utilizes both minimum distance and maximum likelihood methods is outlined and
demonstrated  in  Section  2.1.3.  In  Section  2.2  four  nonparametric  distribution  function  estimators 
are  presented  and  a  new  continuous  estimator  of  the  distribution  function  is  introduced.  The  new 
estimator  is  an  extension  of  AFIT  graduate  James  Sweeder’s  estimator  [131]  to  randomly  censored 
data,  an  idea  brought  forth  by  Dr.  Albert  Moore.  The  estimator  is  a  trigonometrically  smoothed 
and  jackknifed  Kaplan-Meier  estimator.  Two  semi-parametric  distribution  function  estimators  are 
also  presented  and  all  of  the  estimators  are  compared. 

Chapter III addresses goodness-of-fit tests in the case of randomly censored lifetime data.
Asymptotic results for a class of goodness-of-fit statistics based on the Kaplan-Meier estimator are
summarized and examined for the composite hypothesis case when the lifetime and censoring
distributions are both exponential, as well as for the Weibull distribution within the proportional
hazards model of censorship. Section 3.3 provides some justification for the use of the exponential
distribution to model the random censoring variable. Eight new goodness-of-fit tests are introduced,
including four new tests for exponentiality when the censoring variable is exponential and four new
tests for the Weibull distribution for shape parameters 2 and 3.5 when the censoring variable is
exponential. Moreover, two promising new types of goodness-of-fit procedures for use when samples
are randomly censored on the right are introduced, one using a simultaneous fit of crude lifetimes
and one using a fit to the crude lifetime of the random variable of interest only. The second one may
be classified as a partially parametric test of fit because the crude lifetime is fit with a parametric
family while the resulting net lifetime depends on the EDF of the crude lifetime of the censoring
random variable. This approach to goodness-of-fit testing with randomly censored data provides
a tremendous increase in the applicability and flexibility of the tests because it requires no special
estimation techniques, no special test statistics, and no special assumptions since complete sample
goodness-of-fit tests are used. The only necessary assumption is that the net lifetimes of the failure
and censoring distributions are independent. Although these new types of goodness-of-fit
procedures focus on tests of composite hypotheses, many new tests for simple or composite hypotheses
may be constructed similarly.

Percentage  points  for  the  new  goodness-of-fit  tests  have  been  generated  via  Monte  Carlo 
methods  and  are  tabled  in  Sections  3.5  and  3.6.  Section  3.9  contains  power  studies  for  all  of  the 
new  tests  for  the  exponential  and  Weibull  distributions  against  several  alternative  distributions. 
The  new  tests  for  the  exponential  are  compared  to  existing  tests  introduced  by  Burke  [15]  and 
C.  H.  Chen  [21]  in  terms  of  power  against  the  selected  alternatives.  Results  and  observations  are 
summarized  in  Chapter  IV. 




II.  Estimation


2.1  Parametric  Estimation 

2.1.1  Maximum Likelihood using the Censored-Data Likelihood Function.  If we assume
that  the  distributional  family  of  a  random  variable  is  known,  we  can  then  estimate  the  parameters 
of  that  family  using  information  from  a  random  sample  of  observed  values  of  the  variable,  even  if 
they  have  been  randomly  censored.  For  distributions  in  which  the  survivor  function  has  a  closed 
form,  the  censored  data  maximum  likelihood  function  can  usually  be  used  to  estimate  parameters 
without  too  much  difficulty.  The  usual  procedure  in  doing  so,  as  in  the  complete  sample  case,  is  to 
maximize  the  likelihood  function  or  its  natural  logarithm  with  respect  to  the  unknown  parameters 
of  the  distribution.  That  is,  find  the  values  of  the  parameters  for  which  the  likelihood  function  is 
maximized. 

For an observed random sample $x_i = \min\{t_i, c_i\}$, $i = 1, \ldots, n$, with the indicator $\delta_i = I_{[t_i < c_i]}$
in the competing risks model of random censorship, the likelihood function can be expressed as

$$L(x, \delta; \theta) = \prod_{i=1}^{n} \left[ f_T(x_i)\, S_C(x_i) \right]^{\delta_i} \left[ f_C(x_i)\, S_T(x_i) \right]^{1 - \delta_i}$$

where $\theta$ is a vector of parameters. However, since we are not interested in estimating the parameters
of the censoring distribution, the censored-data likelihood function can effectively be written as

$$L(x, \delta; \theta) = \prod_{i=1}^{n} f_T(x_i; \theta)^{\delta_i}\, S_T(x_i; \theta)^{1 - \delta_i}.$$
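To make the censored-data likelihood concrete, the sketch below evaluates its logarithm for an exponential lifetime with mean $\theta$, a case in which the maximizer has the well-known closed form of total observed time divided by the number of failures. The exponential model, the data values, and the function names are assumptions chosen for illustration.

```python
import math


def censored_loglik(theta, sample):
    """Sum of delta_i * log f_T(x_i) + (1 - delta_i) * log S_T(x_i) for T ~ exponential(mean theta)."""
    total = 0.0
    for x, delta in sample:
        log_f = -math.log(theta) - x / theta    # log density at x
        log_S = -x / theta                      # log survivor function at x
        total += delta * log_f + (1 - delta) * log_S
    return total


def exponential_mle(sample):
    """Closed-form maximizer: total observed time divided by the number of failures."""
    r = sum(delta for _, delta in sample)
    return sum(x for x, _ in sample) / r


# Hypothetical data: (x_i, delta_i), with delta_i = 1 for a failure and 0 for a censored time.
sample = [(12.0, 1), (30.0, 0), (45.0, 1), (51.0, 1), (60.0, 0), (88.0, 1)]
theta_hat = exponential_mle(sample)
```

Censored observations contribute only through the survivor function, which is why they add to the total time in the numerator but not to the failure count in the denominator.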

In cases where the survivor function $S_T(x)$ does not have a closed-form expression, such as
with the normal, lognormal, and gamma distributions, the complete sample likelihood function is
appreciably simpler than the censored-data likelihood function. In such cases, the EM
(Expectation-Maximization) algorithm of Dempster, Laird, and Rubin [34] may be used to find maximum
likelihood estimates of parameters [28].




Figures  13,  14,  and  15  in  Appendix  C  show  examples  of  maximum  likelihood  estimates  of  the 
distribution  functions  of  the  exponential,  Weibull  (shape  2),  and  Weibull  (shape  3.5)  distributions. 
The  choice  to  work  with  these  distributions  stems  from  their  wide  applicability  in  lifetime  data 
analysis  and  the  variety  in  density  shape.  The  exponential  distribution  is  heavily  skewed  to  the 
right  while  the  Weibull  with  shape  3.5  is  nearly  symmetric.  The  skewness  of  the  Weibull  with  shape 
2  is  roughly  in  between.  Examples  of  the  effects  of  varying  levels  of  random  censoring  can  be  seen 
in  the  figures  in  Appendix  C  that  have  been  generated  for  each  estimation  technique.  A  typical 
pattern  that  emerges  is  that  the  heavier  the  censoring  the  more  the  distribution  function  tends  to 
be  overestimated,  thus  the  survivor  function  is  underestimated.  Please  note,  however,  in  Figure  15 
that  although  the  best  estimate  in  the  plot  occurred  when  there  was  75%  expected  censoring,  this 
is  generally  not  the  case. 

2.1.2  New Minimum Distance Methods for Randomly Censored Data.  In this section,
we extend minimum distance estimation in such a way as to include the estimation of parameters
in the case of randomly censored samples. Minimum distance estimation, introduced by
Wolfowitz [148, 149], has been demonstrated by Parr and Schucany [111], Hobbs, Moore, and
Miller [64], and Gallagher and Moore [52] to be a robust estimation technique for complete samples,
particularly when estimating location parameters. For the 3-parameter Weibull distribution,
Gallagher and Moore [52] show for complete samples that using minimum distance to estimate
the location parameter and maximum likelihood for shape and scale parameters produced the best
overall results. In comparing several methods involving distance measures, they [52] further show
for the Weibull distribution that the Anderson-Darling goodness-of-fit statistic is the preferred
distance measure with the Cramer-von Mises statistic not far behind. As a result, minimum distance
estimation using the Anderson-Darling and Cramer-von Mises statistics may be used to estimate
parameters for randomly censored samples by replacing the EDF with the Kaplan-Meier product-limit
estimator, which is detailed in Section 2.2.1. The Kaplan-Meier estimator (KME) is relatively
easy to compute, it can be integrated analytically, and it reduces to the EDF in the case of no
censoring.

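For complete samples under a simple hypothesis, the Cramer-von Mises, $W^2$, and
Anderson-Darling, $A^2$, statistics are defined as

$$W^2 = n \int_{-\infty}^{+\infty} \left[ F_n(x) - F_0(x) \right]^2 dF_0(x) \tag{1}$$

and

$$A^2 = n \int_{-\infty}^{+\infty} \frac{\left[ F_n(x) - F_0(x) \right]^2}{F_0(x)\left[ 1 - F_0(x) \right]}\, dF_0(x) \tag{2}$$

where $F_n(x)$ is the EDF and $F_0(x)$ is the distribution function of the random variable $X$. Computing
formulas are given in [32: page 101]. The EDF $F_n(x)$, however, is not available for randomly
censored samples and may be replaced by a suitable estimator of the distribution function, such as
the Kaplan-Meier estimator, denoted here by $\hat{F}_n(x)$. The modified test statistics become

$$W^2_{r,n} = n \int_{-\infty}^{+\infty} \left[ \hat{F}_n(x) - F_0(x) \right]^2 dF_0(x)$$

and

$$A^2_{r,n} = n \int_{-\infty}^{+\infty} \frac{\left[ \hat{F}_n(x) - F_0(x) \right]^2}{F_0(x)\left[ 1 - F_0(x) \right]}\, dF_0(x).$$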

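Upon making this substitution, new computing formulas must be derived. Making use of
the probability integral transform, $U = F_0(X)$, and letting $t_{(j)}$, $1 \le j \le r$, denote the ordered
uncensored observations where $r$ is the number of failures in the set, we obtain the transformed set
of ordered sample values $U_{(j)} = F_0(t_{(j)})$. It will also be necessary to let $t_{(0)} = -\infty$ and
$t_{(r+1)} = +\infty$, hence $U_{(0)} = 0$ and $U_{(r+1)} = 1$. Beginning with the modified Cramer-von Mises statistic

$$W^2_{r,n} = n \int_{-\infty}^{+\infty} \left[ \hat{F}_n(x) - F_0(x) \right]^2 dF_0(x)$$

we make use of the probability integral transform, the fact that
$\hat{F}_n(x) - F_0(x) = \hat{F}_n(Q_0(u)) - u$, where $Q_0(u) = F_0^{-1}(u)$, assuming $F_0^{-1}(u)$ exists, and
express the integral as the sum of integrals over the ordered set of failures to get

$$W^2_{r,n} = n \sum_{j=1}^{r+1} \int_{U_{(j-1)}}^{U_{(j)}} \left[ \hat{F}_n(Q_0(u)) - u \right]^2 du.$$

Expanding the binomial and integrating leads to

$$W^2_{r,n} = n \sum_{j=1}^{r+1} \left\{ \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \left( U_{(j)} - U_{(j-1)} \right) - \hat{F}_n(t_{(j-1)}) \left( U_{(j)}^2 - U_{(j-1)}^2 \right) + \frac{1}{3} \left( U_{(j)}^3 - U_{(j-1)}^3 \right) \right\}$$

since $\hat{F}_n(Q_0(u)) = \hat{F}_n(Q_0(U_{(j-1)})) = \hat{F}_n(t_{(j-1)})$ is a constant within each interval $U_{(j-1)}$ to
$U_{(j)}$. Simplifying and noting that $\hat{F}_n(t_{(0)}) = 0$ gives

$$W^2_{r,n} = n \sum_{j=2}^{r+1} \left\{ \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \left( U_{(j)} - U_{(j-1)} \right) - \hat{F}_n(t_{(j-1)}) \left( U_{(j)}^2 - U_{(j-1)}^2 \right) \right\} + \frac{n}{3}$$

which can then be written as

$$W^2_{r,n} = n \sum_{j=1}^{r} \left\{ \left[ \hat{F}_n(t_{(j)}) \right]^2 \left( U_{(j+1)} - U_{(j)} \right) - \hat{F}_n(t_{(j)}) \left( U_{(j+1)}^2 - U_{(j)}^2 \right) \right\} + \frac{n}{3}. \tag{3}$$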
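The computing formula in Equation 3 translates directly into a short routine. The sketch below takes the transformed ordered failure times $U_{(j)}$ and the value of the distribution function estimate at each ordered failure as inputs, and compares the result with the usual complete-sample computing formula for $W^2$ when the estimate is the EDF. The function names and the numerical values are hypothetical.

```python
def modified_cvm(u, f_hat, n):
    """Equation 3. u[j-1] = U_(j) and f_hat[j-1] = estimate at the j-th ordered failure, j = 1..r."""
    r = len(u)
    u_ext = list(u) + [1.0]                    # append U_(r+1) = 1
    total = 0.0
    for j in range(r):
        lo, hi = u_ext[j], u_ext[j + 1]
        total += f_hat[j] ** 2 * (hi - lo) - f_hat[j] * (hi ** 2 - lo ** 2)
    return n * total + n / 3.0


def complete_sample_cvm(u):
    """Standard computing formula for W^2 with a complete sample of sorted u values."""
    n = len(u)
    return sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n)) + 1 / (12 * n)


# Hypothetical transformed sample with no censoring, so the estimate is the EDF j/n.
u = [0.10, 0.35, 0.50, 0.80, 0.95]
edf = [(j + 1) / len(u) for j in range(len(u))]
w2_modified = modified_cvm(u, edf, n=len(u))
w2_complete = complete_sample_cvm(u)
```

With censoring, `f_hat` would instead hold the Kaplan-Meier values at the `r` failure times while `n` remains the full sample size.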


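The computing formula for the modified Anderson-Darling statistic is obtained similarly as
follows. Beginning with

$$A^2_{r,n} = n \int_{-\infty}^{+\infty} \frac{\left[ \hat{F}_n(x) - F_0(x) \right]^2}{F_0(x)\left[ 1 - F_0(x) \right]}\, dF_0(x)$$

we make the same substitutions as with $W^2_{r,n}$ above and express the integral as the sum of integrals
over the ordered failure set to get

$$A^2_{r,n} = n \sum_{j=1}^{r+1} \int_{U_{(j-1)}}^{U_{(j)}} \frac{\left[ \hat{F}_n(Q_0(u)) - u \right]^2}{u(1 - u)}\, du.$$

Expanding the numerator yields

$$A^2_{r,n} = n \sum_{j=1}^{r+1} \int_{U_{(j-1)}}^{U_{(j)}} \frac{\left[ \hat{F}_n(Q_0(u)) \right]^2 - 2u\, \hat{F}_n(Q_0(u)) + u^2}{u(1 - u)}\, du$$

which can be written as

$$A^2_{r,n} = n \sum_{j=1}^{r+1} \int_{U_{(j-1)}}^{U_{(j)}} \left\{ \frac{\left[ \hat{F}_n(Q_0(u)) \right]^2}{u} + \frac{\left[ \hat{F}_n(Q_0(u)) \right]^2}{1 - u} - \frac{2 \hat{F}_n(Q_0(u))}{1 - u} + \frac{u}{1 - u} \right\} du.$$

Completing the square in

$$A^2_{r,n} = n \sum_{j=1}^{r+1} \int_{U_{(j-1)}}^{U_{(j)}} \left\{ \frac{\left[ \hat{F}_n(Q_0(u)) \right]^2}{u} + \frac{\left[ \hat{F}_n(Q_0(u)) \right]^2}{1 - u} - \frac{2 \hat{F}_n(Q_0(u))}{1 - u} + \frac{1}{1 - u} - \frac{1}{1 - u} + \frac{u}{1 - u} \right\} du$$

gives

$$A^2_{r,n} = n \sum_{j=1}^{r+1} \int_{U_{(j-1)}}^{U_{(j)}} \left\{ \frac{\left[ \hat{F}_n(Q_0(u)) \right]^2}{u} + \frac{\left[ \hat{F}_n(Q_0(u)) - 1 \right]^2}{1 - u} - 1 \right\} du.$$

Again, noting that $\hat{F}_n(Q_0(u)) = \hat{F}_n(Q_0(U_{(j-1)})) = \hat{F}_n(t_{(j-1)})$ is a constant within each interval
$U_{(j-1)}$ to $U_{(j)}$, the integration gives

$$A^2_{r,n} = n \sum_{j=1}^{r+1} \left\{ \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \left[ \log U_{(j)} - \log U_{(j-1)} \right] - \left[ \hat{F}_n(t_{(j-1)}) - 1 \right]^2 \left[ \log(1 - U_{(j)}) - \log(1 - U_{(j-1)}) \right] - \left( U_{(j)} - U_{(j-1)} \right) \right\}. \tag{4}$$

Simplifying and distributing in Equation 4 yields

$$A^2_{r,n} = -n + n \left\{ \sum_{j=1}^{r+1} \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \log U_{(j)} - \sum_{j=1}^{r+1} \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \log U_{(j-1)} - \sum_{j=1}^{r+1} \left[ \hat{F}_n(t_{(j-1)}) - 1 \right]^2 \log(1 - U_{(j)}) + \sum_{j=1}^{r+1} \left[ \hat{F}_n(t_{(j-1)}) - 1 \right]^2 \log(1 - U_{(j-1)}) \right\}. \tag{5}$$

Now, since $\log U_{(r+1)} = 0$, $\hat{F}_n(t_{(0)}) = 0$, $\hat{F}_n(t_{(r+1)}) - 1 = 0$, and $\log(1 - U_{(0)}) = 0$, Equation 5 can
be written as

$$A^2_{r,n} = -n + n \left\{ \sum_{j=1}^{r} \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \log U_{(j)} - \sum_{j=2}^{r+1} \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \log U_{(j-1)} - \sum_{j=1}^{r} \left[ \hat{F}_n(t_{(j-1)}) - 1 \right]^2 \log(1 - U_{(j)}) + \sum_{j=2}^{r+1} \left[ \hat{F}_n(t_{(j-1)}) - 1 \right]^2 \log(1 - U_{(j-1)}) \right\}$$

which simplifies to

$$A^2_{r,n} = -n + n \left\{ \sum_{j=1}^{r} \left[ \hat{F}_n(t_{(j-1)}) \right]^2 \log U_{(j)} - \sum_{j=1}^{r} \left[ \hat{F}_n(t_{(j)}) \right]^2 \log U_{(j)} - \sum_{j=1}^{r} \left[ \hat{F}_n(t_{(j-1)}) - 1 \right]^2 \log(1 - U_{(j)}) + \sum_{j=1}^{r} \left[ \hat{F}_n(t_{(j)}) - 1 \right]^2 \log(1 - U_{(j)}) \right\}.$$

Combining like terms finally yields the computing formula

$$A^2_{r,n} = -n + n \sum_{j=1}^{r} \left\{ \left( \left[ \hat{F}_n(t_{(j-1)}) \right]^2 - \left[ \hat{F}_n(t_{(j)}) \right]^2 \right) \log U_{(j)} - \left( \left[ \hat{F}_n(t_{(j-1)}) - 1 \right]^2 - \left[ \hat{F}_n(t_{(j)}) - 1 \right]^2 \right) \log(1 - U_{(j)}) \right\}. \tag{6}$$

Since the Kaplan-Meier estimator reduces to the EDF when there is no censoring, both of the
modified distance measures given above reduce to their complete sample counterparts in the case
of no censoring.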
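The reduction to the complete-sample statistic can be checked numerically. The sketch below evaluates Equation 6 as stated, taking the transformed ordered failure times $U_{(j)}$ and the estimate at each ordered failure as inputs, and compares it with the standard complete-sample computing formula for $A^2$ when the estimate is the EDF. The function names and numerical values are hypothetical.

```python
import math


def modified_ad(u, f_hat, n):
    """Equation 6. u[j-1] = U_(j) and f_hat[j-1] = estimate at the j-th ordered failure, j = 1..r."""
    f_prev = [0.0] + list(f_hat[:-1])          # estimate at the previous failure, 0 before the first
    total = 0.0
    for j in range(len(u)):
        a = f_prev[j] ** 2 - f_hat[j] ** 2
        b = (f_prev[j] - 1.0) ** 2 - (f_hat[j] - 1.0) ** 2
        total += a * math.log(u[j]) - b * math.log(1.0 - u[j])
    return -n + n * total


def complete_sample_ad(u):
    """Standard computing formula for A^2 with a complete sample of sorted u values in (0, 1)."""
    n = len(u)
    s = sum(
        (2 * i + 1) * math.log(u[i]) + (2 * n - 2 * i - 1) * math.log(1.0 - u[i])
        for i in range(n)
    )
    return -n - s / n


# Hypothetical transformed sample with no censoring, so the estimate is the EDF j/n.
u = [0.10, 0.35, 0.50, 0.80, 0.95]
edf = [(j + 1) / len(u) for j in range(len(u))]
a2_modified = modified_ad(u, edf, n=len(u))
a2_complete = complete_sample_ad(u)
```

Both routines require every $U_{(j)}$ to lie strictly between 0 and 1, since the logarithms are otherwise undefined.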


2.1.3  Estimating the Parameters of the 3-Parameter Weibull Distribution.  The new
minimum distance estimation techniques for randomly censored samples developed in Section 2.1.2
are applied in this section to the estimation of the parameters of the 3-parameter Weibull distribution.
Using a tandem of minimum distance estimation and maximum likelihood estimation has been
shown to be quite effective and robust in the estimation of the parameters of the 3-parameter
Weibull distribution for complete samples [52, 64]. The success of a 1990 study by Gallagher and
Moore [52] using minimum distance methods to estimate the location parameter and then maximum
likelihood to estimate the shape and scale given the location parameter prompted this study for a
randomly censored Weibull. The parameterization of the Weibull distribution used here is

$$F(x) = 1 - \exp\left[ -\left( \frac{x - \gamma}{\eta} \right)^{\beta} \right], \quad x \geq \gamma \geq 0,$$

where $\gamma$, $\eta$, and $\beta$ represent the location, scale, and shape parameters, respectively. In the current
study, our estimation procedure is as follows:

1.  obtain an initial estimate of the location parameter using either $\gamma^* = 0.999\, x_{(1)}$ or

$$\gamma^* = x_{(1)} - \frac{\sum_{i=2}^{n} \left( x_{(i)} - x_{(1)} \right)}{n(r - 1)};$$

2.  use ML to obtain initial estimates of shape and scale parameters;

3.  use MDE with either $W^2_{r,n}$ or $A^2_{r,n}$ to refine the location parameter estimate given the shape
and scale parameters;

4.  re-estimate the shape and scale using ML estimation given the MD-refined location estimate.

The idea of using an initial location parameter estimate $\gamma^* = 0.999\, x_{(1)}$ is from [5]. The factor
of 0.999 is needed simply to avoid division by zero in the Anderson-Darling statistic. The idea of
using

$$\gamma^* = x_{(1)} - \frac{\sum_{i=2}^{n} \left( x_{(i)} - x_{(1)} \right)}{n(r - 1)}$$

was suggested by committee member Albert H. Moore and was inspired by Kapur and Lamberson
[71] using it to estimate the location parameter of the 2-parameter exponential distribution.
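The two initial location estimates from step 1 might be computed as in the following sketch. The first is $0.999\, x_{(1)}$. For the second, the source formula is only partly legible, so the code assumes the form $x_{(1)} - \sum_{i=2}^{n} (x_{(i)} - x_{(1)}) / (n(r-1))$, with the sum running over all $n$ ordered observations; that upper limit, the function name, and the data are assumptions.

```python
def initial_location_estimates(sample):
    """Return the two starting values for the location parameter from pairs (x_i, delta_i)."""
    xs = sorted(x for x, _ in sample)
    n = len(xs)
    r = sum(delta for _, delta in sample)       # number of failures, must be at least 2
    x1 = xs[0]                                  # smallest observation x_(1)
    gamma_1 = 0.999 * x1
    # Assumed form: total excess over x_(1), divided by n * (r - 1).
    gamma_2 = x1 - sum(x - x1 for x in xs[1:]) / (n * (r - 1))
    return gamma_1, gamma_2


# Hypothetical randomly censored sample: delta = 1 marks a failure, 0 a censored time.
sample = [(25.0, 1), (31.0, 0), (40.0, 1), (52.0, 1), (58.0, 0), (75.0, 1)]
gamma_1, gamma_2 = initial_location_estimates(sample)
```

Both starting values lie strictly below the smallest observation when that observation is positive, which keeps the fitted distribution function positive at every data point.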

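The probability density function of the 3-parameter Weibull distribution is

$$f(x) = \frac{\beta}{\eta} \left( \frac{x - \gamma}{\eta} \right)^{\beta - 1} \exp\left[ -\left( \frac{x - \gamma}{\eta} \right)^{\beta} \right].$$

The likelihood function for an observed random sample $x_i = \min\{t_i, c_i\}$, $i = 1, \ldots, n$, with the
indicator $\delta_i = I_{[t_i < c_i]}$ is given by

$$L(x; \gamma, \beta, \eta) = \prod_{i=1}^{n} \left\{ \frac{\beta}{\eta^{\beta}} (x_i - \gamma)^{\beta - 1} \exp\left[ -\left( \frac{x_i - \gamma}{\eta} \right)^{\beta} \right] \right\}^{\delta_i} \left\{ \exp\left[ -\left( \frac{x_i - \gamma}{\eta} \right)^{\beta} \right] \right\}^{1 - \delta_i}.$$

The natural logarithm of the likelihood function is

$$\log L(x; \gamma, \beta, \eta) = \sum_{i=1}^{n} \delta_i \log \beta - \sum_{i=1}^{n} \delta_i\, \beta \log \eta + (\beta - 1) \sum_{i=1}^{n} \delta_i \log(x_i - \gamma) - \sum_{i=1}^{n} \left( \frac{x_i - \gamma}{\eta} \right)^{\beta}$$

and, letting $r = \sum_{i=1}^{n} \delta_i$, the partial derivatives with respect to the parameters are

$$\frac{\partial \log L(x; \gamma, \beta, \eta)}{\partial \gamma} = -(\beta - 1) \sum_{i=1}^{n} \frac{\delta_i}{x_i - \gamma} + \frac{\beta}{\eta^{\beta}} \sum_{i=1}^{n} (x_i - \gamma)^{\beta - 1}, \tag{7}$$

$$\frac{\partial \log L(x; \gamma, \beta, \eta)}{\partial \eta} = -\frac{r \beta}{\eta} + \frac{\beta}{\eta} \sum_{i=1}^{n} \left( \frac{x_i - \gamma}{\eta} \right)^{\beta}, \tag{8}$$

and

$$\frac{\partial \log L(x; \gamma, \beta, \eta)}{\partial \beta} = \frac{r}{\beta} - r \log \eta + \sum_{i=1}^{n} \delta_i \log(x_i - \gamma) - \sum_{i=1}^{n} \left( \frac{x_i - \gamma}{\eta} \right)^{\beta} \log\left( \frac{x_i - \gamma}{\eta} \right). \tag{9}$$

The necessary and sufficient conditions for maximizing the likelihood function, or its natural
logarithm, are well known. In order to find maximum likelihood estimates $\hat{\gamma}$, $\hat{\beta}$, and $\hat{\eta}$ it is necessary
that Equations 7, 8, and 9 be set to zero and solved simultaneously. The sufficient condition is
that the matrix of second partial derivatives be negative definite at the solution, which can be
verified at the solution. Upon setting each of the above equations equal to zero, it is possible to
express $\hat{\eta}$ in terms of $\hat{\gamma}$ and $\hat{\beta}$ as

$$\hat{\eta} = \left[ \frac{1}{r} \sum_{i=1}^{n} (x_i - \hat{\gamma})^{\hat{\beta}} \right]^{1/\hat{\beta}} \tag{10}$$

but there are no explicit solutions for $\hat{\gamma}$ or $\hat{\beta}$. In fact, the maximum likelihood approach to finding
a parameter estimate for $\gamma$ often fails to converge, which is why we are using the highly reliable
minimum distance method to estimate $\gamma$. Nevertheless, when the underlying distribution is
correctly specified, maximum likelihood estimators are asymptotically unbiased and asymptotically
minimum variance estimators, and so we continue to estimate $\eta$ and $\beta$ using the likelihood
equations. Now, the expression for $\hat{\eta}$ in Equation 10 can be substituted into Equation 9 to yield

$$\frac{r}{\hat{\beta}} + \sum_{i=1}^{n} \delta_i \log(x_i - \hat{\gamma}) - \frac{r \sum_{i=1}^{n} (x_i - \hat{\gamma})^{\hat{\beta}} \log(x_i - \hat{\gamma})}{\sum_{i=1}^{n} (x_i - \hat{\gamma})^{\hat{\beta}}} = 0. \tag{11}$$

Once an initial estimate of the location parameter $\gamma$ is obtained, Equations 10 and 11 are solved
numerically for $\hat{\eta}$ and $\hat{\beta}$. The Newton-Raphson procedure was used for this purpose throughout
this dissertation.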
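One possible way to carry out this step is sketched below: Equation 11 is solved for the shape with a Newton-Raphson iteration, and the scale then follows from Equation 10. The bisection safeguard, the rescaling of the shifted data by their maximum (which leaves Equation 11 unchanged), the function name, and the data are additions made for this illustration and are not details reported in the dissertation. The sketch assumes the location value lies below every observation and that at least one failure is smaller than the largest observation.

```python
import math


def weibull_ml_given_location(sample, gamma, tol=1e-12, max_iter=200):
    """Solve Equation 11 for the shape, then Equation 10 for the scale, with gamma held fixed."""
    delta = [d for _, d in sample]
    r = sum(delta)
    shifted = [x - gamma for x, _ in sample]           # requires gamma < min(x_i)
    c = max(shifted)
    log_z = [math.log(v / c) for v in shifted]         # rescaled logs, all <= 0
    s_fail = sum(d * lz for d, lz in zip(delta, log_z))

    def g_and_slope(beta):
        w = [math.exp(beta * lz) for lz in log_z]      # (v / c) ** beta, cannot overflow
        B = sum(w)
        A = sum(wi * lz for wi, lz in zip(w, log_z))
        C = sum(wi * lz * lz for wi, lz in zip(w, log_z))
        g = r / beta + s_fail - r * A / B              # left side of Equation 11
        slope = -r / beta ** 2 - r * (C * B - A * A) / B ** 2
        return g, slope

    lo, hi = 1e-6, 1.0
    while g_and_slope(hi)[0] > 0:                      # g decreases in beta: expand the bracket
        lo, hi = hi, 2.0 * hi
    beta = 0.5 * (lo + hi)
    for _ in range(max_iter):
        g, slope = g_and_slope(beta)
        if g > 0:
            lo = beta
        else:
            hi = beta
        new_beta = beta - g / slope                    # Newton-Raphson step
        if not lo < new_beta < hi:
            new_beta = 0.5 * (lo + hi)                 # safeguard: fall back to bisection
        if abs(new_beta - beta) < tol:
            beta = new_beta
            break
        beta = new_beta
    w_sum = sum(math.exp(beta * lz) for lz in log_z)
    eta = c * (w_sum / r) ** (1.0 / beta)              # Equation 10, undoing the rescaling
    return eta, beta


# Hypothetical randomly censored sample and a fixed location value.
sample = [(25.0, 1), (31.0, 0), (40.0, 1), (52.0, 1), (58.0, 0), (75.0, 1)]
eta_hat, beta_hat = weibull_ml_given_location(sample, gamma=20.0)
```

Because the left side of Equation 11 is strictly decreasing in the shape parameter, the root is unique once the location is fixed, so the bracket found by doubling always contains it.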

A Monte Carlo study was conducted to compare the four variations of this MD/ML estimation
technique for the 3-parameter Weibull distribution. The censoring distribution in each case
was a 2-parameter exponential distribution with its location parameter equal to that of the
underlying Weibull distribution and scale parameter adjusted to give the desired expected proportion of
censoring for each case. The estimation procedure is location and scale invariant, as discussed in
Section 3.4.2, so the comparison was made for shape parameters 2 and 3.5. See Table 1 for the
parameter values used in this study. The variations are distinguished by the choice of distance
estimator and initial location parameter estimate and are abbreviated as shown in Table 2. Tabulated
in Tables 3 and 4 are the mean and standard deviation (shown in parentheses) of each parameter
estimate as well as the integrated squared error (ISE) of the distribution function estimate. ISE is
traditionally defined as the integrated squared difference between the estimated and true probability
density functions, but the discrepancy between the estimated and true distribution functions will be
used here instead. There are two reasons for making this change. First, some of the estimators used
in the case of randomly censored data, such as the Kaplan-Meier estimator and the mean order
number estimator, are step-function estimates of the distribution function and provide no estimate
of the density function. Second, the Cramer-von Mises and Anderson-Darling goodness-of-fit statistics
use the distribution function rather than the density function to test the fit of a hypothesized
parametric family.

Table 1  Parameter Values of the 3-Parameter Weibull for MISE Comparison.

Failure        Parameter                                  Censoring      Parameter                          Expected
Distribution   Values                                     Distribution   Values                             Censoring

Weibull        $\beta = 2$, $\eta = 50$, $\gamma = 20$      Exponential    $\theta = 148$, $\gamma = 20$      $q = .25$
                                                                         $\theta = 57.75$, $\gamma = 20$    $q = .50$
                                                                         $\theta = 25.7$, $\gamma = 20$     $q = .75$

Weibull        $\beta = 3.5$, $\eta = 50$, $\gamma = 20$    Exponential    $\theta = 154$, $\gamma = 20$      $q = .25$
                                                                         $\theta = 62.5$, $\gamma = 20$     $q = .50$
                                                                         $\theta = 30$, $\gamma = 20$       $q = .75$
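The adjustment of the censoring scale to reach a target expected proportion of censoring can be sketched as a one-dimensional root search. The code below computes $q = P[C < T]$ for a Weibull lifetime and an exponential censoring time that share the same location (so the location cancels), and then solves for the exponential scale $\theta$ by bisection. The integration rule, the bisection on a logarithmic scale, and the function names are assumptions made for illustration; the dissertation does not state how its values were obtained.

```python
import math


def expected_censoring(theta, eta, beta, m=20_000):
    """q = P[C < T] with T - gamma ~ Weibull(eta, beta) and C - gamma ~ exponential(mean theta)."""
    total = 0.0
    for k in range(m):
        u = (k + 0.5) / m                                  # midpoint rule on u = F_T(t)
        t = eta * (-math.log(1.0 - u)) ** (1.0 / beta)     # Weibull quantile with location removed
        total += math.exp(-t / theta)                      # S_C evaluated at that quantile
    return 1.0 - total / m                                 # 1 - p


def censoring_scale(q_target, eta, beta, lo=1e-3, hi=1e6, iters=60):
    """Bisection for theta; the expected censoring decreases as theta grows."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)                           # geometric midpoint
        if expected_censoring(mid, eta, beta) > q_target:
            lo = mid                                       # too much censoring: increase theta
        else:
            hi = mid
    return math.sqrt(lo * hi)


theta_25 = censoring_scale(0.25, eta=50.0, beta=2.0)
```

For shape 2 and scale 50 the search returns a value close to the $\theta = 148$ listed in Table 1 for 25% expected censoring.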

The ISE of a distribution function estimator is defined here as

$$\mathrm{ISE} = \int_{-\infty}^{+\infty} \left[ \hat{F}(x) - F(x) \right]^2 dx$$

and was chosen because it provides a good overall measure of the closeness of an estimator to the
true function. In the comparison procedure, the actual integration limits used were $\max\{\gamma, \hat{\gamma}\}$ for
the  lower  limit  and  +  200  for  the  upper  limit.  Monte  Carlo  samples  of  size  N  =  1000  were 
used to compare estimation techniques for samples of size $n = 20$ and $60$ with expected proportion
of censoring set at $q = 0.25$, $0.50$, and $0.75$. The ISE was computed for each of the 1000 Monte
Carlo samples and the mean and standard deviation were found. The mean and standard deviation
of each parameter estimate were also determined and are shown in Tables 3 and 4 along with the
mean and standard deviation of the ISE.
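A minimal sketch of the ISE computation for a single fitted 3-parameter Weibull follows. The lower limit is $\max\{\gamma, \hat{\gamma}\}$ as stated in the text. The upper limit is taken here to be the true location plus 200, which is an assumption about what the offset of 200 is added to. The estimated parameter values, the midpoint rule, and the function names are likewise hypothetical.

```python
import math


def weibull_cdf(x, gamma, eta, beta):
    """Distribution function of the 3-parameter Weibull in the parameterization of Section 2.1.3."""
    return 0.0 if x <= gamma else 1.0 - math.exp(-(((x - gamma) / eta) ** beta))


def ise(est, true, lower, upper, steps=20_000):
    """Midpoint-rule approximation of the integral of [F_hat(x) - F(x)]^2 over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * h
        total += (weibull_cdf(x, *est) - weibull_cdf(x, *true)) ** 2
    return h * total


true_params = (20.0, 50.0, 2.0)        # (gamma, eta, beta) as in Table 1
est_params = (21.0, 48.0, 2.2)         # hypothetical estimate for one Monte Carlo sample
lower = max(true_params[0], est_params[0])
upper = true_params[0] + 200.0         # assumed upper limit for this sketch
ise_value = ise(est_params, true_params, lower, upper)
```

Averaging `ise_value` over the Monte Carlo samples would give the mean ISE reported in the tables.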

The means and standard deviations (in parentheses) shown in Tables 3 and 4 for each
parameter estimate and the ISE indicate that although the modified Anderson-Darling
statistic performs slightly better than the modified Cramer-von Mises statistic in the majority of
cases according to the mean ISE (MISE) criterion, the estimators are very similar. Furthermore, the
choice of the initial location parameter estimate does not seem to have much of an effect,
demonstrating the robustness of MDE even when samples are randomly censored. As expected, however,
parameter and distribution function estimation improves with increased sample size but is
increasingly hindered as censoring increases. Another important result found in Tables 3 and 4 is the
effect of censoring on the parameter estimates. The trend in the parameter estimation seen here is that
location parameter estimates become increasingly lower while shape and scale parameter estimates
creep ever higher as the expected proportion of censoring increases. A repetition of steps 3 and 4 of
our estimation procedure (listed at the start of this section) was examined for the 3-parameter Weibull
distribution with shape values 2 and 3.5 and was found to provide no additional improvement in
estimation.

Table 2  Key to Abbreviations of MD/ML Estimators for the 3-Parameter Weibull Distribution.

Abbreviation   Description

MDLCvM1        Minimum Distance for the Location using the Cramer-von Mises statistic with
               initial estimate $\gamma^* = 0.999\, x_{(1)}$
               and ML estimation for shape and scale parameters

MDLAD1         Minimum Distance for the Location using the Anderson-Darling statistic with
               initial estimate $\gamma^* = 0.999\, x_{(1)}$
               and ML estimation for shape and scale parameters

MDLCvM2        Minimum Distance for the Location using the Cramer-von Mises statistic with
               initial estimate $\gamma^* = x_{(1)} - \sum_{i=2}^{n} (x_{(i)} - x_{(1)}) / (n(r - 1))$
               and ML estimation for shape and scale parameters

MDLAD2         Minimum Distance for the Location using the Anderson-Darling statistic with
               initial estimate $\gamma^* = x_{(1)} - \sum_{i=2}^{n} (x_{(i)} - x_{(1)}) / (n(r - 1))$
               and ML estimation for shape and scale parameters

Because ML estimates are asymptotically normally distributed, the mean and standard
deviation are suitable comparison statistics. In contrast, the inherent skewness of the ISE (see Figure
1 for an example) makes it difficult to compare these estimators based on the mean and standard
deviation of the ISE alone. A more suitable comparison may be obtained using the Kruskal-Wallis
test for homogeneity, a nonparametric procedure used to determine if values of one population
are systematically larger than the values of another [85: pp. 409-410]. Several populations may
be compared simultaneously. Kruskal-Wallis test results and side-by-side boxplots can be used to
statistically discern which estimator has the lowest ISE under a given set of conditions (i.e., amount
of censoring, sample size, and underlying distribution).

Table 3  3-Parameter Weibull, shape $\beta = 2$, with Exponential Censoring,
         Mean and Standard Deviation (in parentheses)

Expected    Sample   Method of    Location         Scale            Shape         MISE of CDF
Censoring   Size     Estimation   $\gamma = 20$    $\eta = 50$      $\beta = 2$   (std. dev.)

q = 0.25    n = 20   MDLCvM1      24.50 (4.12)     44.29 (7.89)     …             …
                     MDLAD1       24.42 (3.96)     44.52 (7.60)     …             …
                     MDLCvM2      22.83 (4.58)     46.40 (8.28)     …             …
                     MDLAD2       22.27 (2.24)     47.17 (7.97)     …             …
            n = 60   MDLCvM1      21.78 (1.84)     47.85 (4.39)     …             …
                     MDLAD1       21.85 (1.75)     47.81 (4.29)     …             …
                     MDLCvM2      21.29 (2.02)     48.45 (4.53)     …             …
                     MDLAD2       21.15 (1.82)     48.66 (4.36)     …             …
q = 0.50    n = 20   MDLCvM1      21.48 (3.47)     48.35 (9.64)     …             …
                     MDLAD1       21.58 (3.26)     48.09 (9.49)     …             …
                     MDLCvM2      19.03 (4.12)     …                …             …
                     MDLAD2       18.80 (3.71)     …                …             …
            n = 60   MDLCvM1      20.49 (1.09)     …                …             …
                     MDLAD1       20.60 (0.96)     …                …             …
                     MDLCvM2      18.77 (1.35)     …                …             …
                     MDLAD2       19.75 (1.09)     …                …             …
q = 0.75    n = 20   MDLCvM1      16.04 (7.11)     …                …             …
                     MDLAD1       16.30 (6.98)     …                …             …
                     MDLCvM2      12.58 (7.15)     …                …             …
                     MDLAD2       12.53 (6.91)     …                …             …
            n = 60   MDLCvM1      18.25 (3.39)     …                …             …
                     MDLAD1       18.44 (3.15)     …                …             …
                     MDLCvM2      …                …                …             …
                     MDLAD2       …                …                …             …

Scale entries whose rows are not recoverable: 49.48 (4.90), 49.34 (4.85), 50.27 (5.07), 50.27 (4.97).

Table 4  3-Parameter Weibull, shape $\beta = 3.5$, with Exponential Censoring,
         Mean and Standard Deviation (in parentheses)

Expected    Sample   Method of    Location         Scale            Shape           MISE of CDF
Censoring   Size     Estimation   $\gamma = 20$    $\eta = 50$      $\beta = 3.5$   (std. dev.)

q = 0.25    n = 20   MDLCvM1      26.41 (5.96)     42.83 (7.38)     …               …
                     MDLAD1       26.47 (5.84)     42.79 (7.19)     …               …
                     MDLCvM2      24.70 (6.55)     44.66 (7.97)     …               …
                     MDLAD2       24.45 (6.27)     44.99 (7.60)     …               …
            n = 60   MDLCvM1      22.25 (2.40)     47.58 (3.37)     …               …
                     MDLAD1       22.34 (2.36)     47.49 (3.33)     …               …
                     MDLCvM2      21.71 (2.54)     48.15 (3.51)     …               …
                     MDLAD2       21.70 (2.46)     48.17 (3.43)     …               …
q = 0.50    n = 20   MDLCvM1      22.55 (3.33)     46.98 (5.80)     …               …
                     MDLAD1       22.71 (3.24)     46.80 (5.73)     …               …
                     MDLCvM2      19.88 (3.98)     49.74 (6.55)     …               …
                     MDLAD2       19.81 (3.74)     49.81 (6.37)     …               …
            n = 60   MDLCvM1      20.71 (1.15)     49.15 (2.90)     …               …
                     MDLAD1       20.84 (1.08)     49.03 (2.86)     …               …
                     MDLCvM2      19.95 (1.34)     49.95 (3.04)     …               …
                     MDLAD2       19.99 (1.20)     49.91 (2.97)     …               …
q = 0.75    n = 20   MDLCvM1      …                49.83 (8.60)     …               …
                     MDLAD1       19.68 (4.09)     49.67 (8.53)     …               …
                     MDLCvM2      14.25 (5.60)     55.11 (9.91)     …               …
                     MDLAD2       14.26 (5.48)     55.09 (9.87)     …               …
            n = 60   MDLCvM1      19.85 (1.33)     50.02 (4.46)     …               …
                     MDLAD1       19.97 (1.25)     49.88 (4.43)     …               …
                     MDLCvM2      18.58 (1.75)     51.31 (4.78)     …               …
                     MDLAD2       18.63 (1.59)     51.25 (4.71)     …               …

[Plot: Kernel estimate of 1000 ISE Values for the MDLCvM1 Estimator, Weibull (2), n = 20, q = 0.25]

Figure 1  An Example of the Skewness of Integrated Squared Error at Sample Size 20.

For each Weibull distribution, level of censoring, and sample size, Kruskal-Wallis tests were
used to compare Monte Carlo samples of 1000 ISE measurements for each of the four MD/ML
estimation techniques. P-values of the Kruskal-Wallis tests are given in Table 5. Because large
p-values provide no evidence against homogeneous populations, the four MD/ML estimation
techniques appear to provide statistically equivalent estimates under each circumstance,
with the possible exception of the Weibull with shape 2 at sample size 20 with 25% censoring.
With a p-value of only 0.14, the smallest entry in Table 5, further investigation
of the estimators was warranted. Pairwise comparisons of the four MD/ML estimation techniques
at $n = 20$ and $q = 0.25$ for the Weibull with shape 2 revealed one significant difference: the
MDLAD2 technique had significantly lower ISE than the MDLCvM1 estimation technique
(p-value $\approx 0$). This does not, however, imply that the ISE of MDLAD2 is significantly lower than
the ISE of MDLCvM2 or MDLAD1, nor does it imply that the ISE of MDLCvM1 is significantly higher
than the ISE of MDLCvM2 or MDLAD1, as no other significant differences were found among any
other pairwise comparisons.

Table 5  P-values of the Kruskal-Wallis Test for ISE Comparison of MD/ML Estimators.
         3-Parameter Weibull with Exponential Censoring

             n      For $\beta = 2$    For $\beta = 3.5$

q = 0.25     20     0.14               0.70
             60     0.85               1.00
q = 0.50     20     0.59               1.00
             60     0.95               1.00
q = 0.75     20     0.96               1.00
             60     1.00               1.00
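For readers unfamiliar with the Kruskal-Wallis procedure, the following sketch computes the test statistic $H$ from the pooled ranks and refers it to a chi-square distribution with one fewer degree of freedom than the number of groups, which is the usual large-sample approximation. The sketch assumes no tied observations, which is reasonable for continuous ISE values. The function names and data are hypothetical, and this is not the software used in the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df, using the closed-form finite series."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)       # first series term for odd df
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (k + 0.5)
    return total


def kruskal_wallis(groups):
    """Return (H, p-value) for the Kruskal-Wallis test; assumes no tied observations."""
    pooled = sorted((value, g) for g, group in enumerate(groups) for value in group)
    rank_sums = [0.0] * len(groups)
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    N = len(pooled)
    weighted = sum(rs ** 2 / len(group) for rs, group in zip(rank_sums, groups))
    H = 12.0 / (N * (N + 1)) * weighted - 3.0 * (N + 1)
    return H, chi2_sf(H, len(groups) - 1)


# Hypothetical example with three completely separated groups.
H, p_value = kruskal_wallis([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```

In the setting of Table 5 each call would receive four groups of 1000 ISE values, one group per MD/ML variation, giving three degrees of freedom.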


2.2  Nonparametric Estimation

2,2,1  The  Kaplan-Meier  Product-Limit  Estimator,  One  of  the  earliest  works  to  recognize 
the  problem  of  statistical  analysis  in  the  presence  of  random  censorship,  particularly  as  it  relates 
to  the  theory  of  medical  follow-up  studies,  is  by  Harris,  Meier  and  Tukey  [60].  However,  actuarial 
life  table  estimation  of  a  distribution  function  from  variably  censored  observations  was  performed 
as  early  as  1912  using  arbitrary  grouping  intervals  [70].  In  1926  Greenwood  derived  the  variance 
of  the  life  table  estimate,  which  was  extended  in  1949  by  Irwin  to  obtain  an  estimated  covariance 
function  [70].  Kaplan  and  Meier  [70]  showed  in  their  famous  1958  paper  that  the  arbitrary  groupings 
were  unnecessary,  thus  introducing  the  product-limit  estimator,  which  is  probably  the  most  well- 
known  and  possibly  the  most  significant  contribution  to  the  field  regarding  randomly  censored  data. 
The  product-limit  estimator  is  thought  of  as  the  nonparametric  maximum  likelihood  estimator  of 
the  underlying  survivor  function  and  reduces  to  the  empirical  distribution  function  (EDF)  when 
no  censoring  occurs.  The  terms  “product-limit  estimator”  and  “Kaplan-Meier  estimator”  are  used 
interchangeably. Efron showed the product-limit estimator to be “self-consistent”, among other
things, in his often-referenced contribution to the 1967 Berkeley Symposium on Mathematical
Statistics and Probability [43].




Many authors have examined the properties of the Kaplan-Meier estimator (KME). Breslow
and Crowley [13] established limiting pointwise normality for the product-limit estimate.
Confidence bands for the survival probabilities given by the KME have been explored by Thomas and
Grunkemeier [134], Gillespie and Fisher [55] and Hall and Wellner [59]. Meier [97] provides proofs
of many properties of the KME, including the Markovian property, Greenwood’s asymptotic
variance formulation, consistency and asymptotic normality, and illustrates the parallel to properties
of the EDF for complete samples. Peterson [7] demonstrates the utility of expressing the KME
as a function of empirical subsurvival functions by using them to give an easy proof of the strong
consistency of the KME. Eubank and LaRiccia [45] used a regression model for the Kaplan-Meier
quantile process to construct asymptotically normal and efficient estimators of location and scale
parameters. Chen, Hollander and Langberg [23] address small sample results for the KME under
the proportional hazards model used by Koziol and Green [81]. Miller [99] looked at the asymptotic
efficiency of the KME relative to the maximum likelihood estimator (MLE) of the survival
function. The KME and MLE of the survival function are compared for samples containing undetected
outliers by Aranda-Ordaz [105].

Mauro  [93]  considers  the  KME  from  a  combinatoric  viewpoint  and  Wellner  [142]  provides  an 
approximate  variance  formula  for  the  KME  which  is  compared  to  that  given  by  Chen,  Hollander  and 
Langberg  [23].  Wang  [139]  showed  the  uniform  consistency  of  the  KME  and  Kumazawa  [82]  derived 
a  consistent  estimator  of  the  variance  function,  which  is  dependent  on  the  censoring  distribution, 
of  a  life  expectancy  estimator  based  on  the  KME  given  by  Yang  [152].  Zhou  [153]  examines  the 
effects  of  non-i.i.d.  survival  and  censoring  times  on  the  KME.  Stute  and  Wang  [124]  prove  the 
strong  law  under  random  censorship,  which  can  be  used  to  show  consistency  of  many  estimators. 
Stute  [126]  shows  the  bias  of  Kaplan-Meier  integrals.  Stute  and  Wang  [126]  provide  a  jackknife 
estimate  of  the  Kaplan-Meier  integral  and  use  it  to  derive  formulas  for  mean  life,  higher  moments, 
and  mean  residual  life.  The  Kaplan-Meier  integral  is  proven  to  be  asymptotically  normal  when 
properly standardized by Stute [125] using the Central Limit Theorem. Csorgo [30] establishes
universal Gaussian approximations for empirical cumulative hazard and product-limit processes
under  random  censorship.  Van  Keilegom  and  Veraverbeke  [73]  investigated  almost  sure  convergence 
properties  of  the  KME. 

In defining the Kaplan-Meier product-limit estimator, we consider the random sample
$x_i = \min(t_i, c_i)$, $i = 1, \ldots, n$, with the indicator $\delta_i = I[t_i < c_i]$ as defined in
Chapter 1. The product-limit estimator of the distribution function is defined by

\[
F_n(x) =
\begin{cases}
0, & x < x_{(1)} \\
1 - \prod_{i:\, x_i \le x} \left( \dfrac{n - R_i}{n - R_i + 1} \right)^{\delta_i}, & x_{(1)} \le x < x_{(n)} \\
1, & x \ge x_{(n)}
\end{cases}
\]

where $R_i$ is the rank of the pair $(x_i, 1 - \delta_i)$ in the lexicographic ordering of the sequence
$(x_1, 1 - \delta_1), (x_2, 1 - \delta_2), \ldots, (x_n, 1 - \delta_n)$. Notice that this formulation treats the
last observation as a failure regardless of whether it actually is or not. This practice has been
adopted by many authors since the estimator was first put forth by Kaplan and Meier [70] in 1958,
who stated at the time that the product-limit estimator should not be used beyond $x_{(n)}$ if
$x_{(n)}$ corresponded to a censoring time.
They  simply  say  that  Fn{x)  should  be  considered  to  fall  between  0  and  [70:  p.  463].  Plots 

of  examples  illustrating  the  Kaplan-Meier  estimator  can  be  found  in  Appendix  C. 
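The rank-based product form above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in this work: the function name `kaplan_meier_cdf` and the list-based interface are hypothetical, and the last observation is forced to a failure as in the convention described above.

```python
def kaplan_meier_cdf(x, delta):
    """Product-limit estimate of F at each ordered observation.

    x     : observed values x_i = min(t_i, c_i)
    delta : 1 if x_i is an observed failure, 0 if it is censored
    Returns (sorted observations, F_n evaluated at each of them).
    """
    n = len(x)
    # Lexicographic ordering of the pairs (x_i, 1 - delta_i): at tied
    # values, failures are ranked ahead of censored observations.
    order = sorted(range(n), key=lambda i: (x[i], 1 - delta[i]))
    surv = 1.0
    xs, cdf = [], []
    for rank, i in enumerate(order, start=1):
        if delta[i] == 1:
            # Only failures contribute a factor (n - R_i) / (n - R_i + 1).
            surv *= (n - rank) / (n - rank + 1)
        xs.append(x[i])
        cdf.append(1.0 - surv)
    # Convention from the text: the largest observation is treated as a
    # failure, so the estimate reaches 1 at x_(n).
    cdf[-1] = 1.0
    return xs, cdf


if __name__ == "__main__":
    xs, cdf = kaplan_meier_cdf([3.0, 1.0, 4.0, 2.0], [1, 1, 1, 0])
    for value, estimate in zip(xs, cdf):
        print(value, estimate)
```

With no censored observations every factor is applied and the returned values coincide with the EDF, which is the reduction noted earlier in this section.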


2.2.2 The Mean Order Number Estimator. The idea here is to estimate the true order
number of each observed failure, since the order number it is assigned in the randomly censored
sample is usually not the order number it would have assumed had the censored items been
observable until failure. To estimate the true order number, consider all possible combinations of
censored and failed items in which a particular failed item could have been assigned a certain order
number, then use the number of ways each item could take on each particular order number to find
the mean order number. The following algorithm is outlined in Kapur and Lamberson [71] and Sun
and Kececioglu [127]. Let $O_i$, $i = 1, \ldots, r$, represent the mean order number for ordered failure
number $i$ and $N^+$ denote the number of items following the current set of withdrawals. Now, if the
first observation in the full randomly censored data set is an observed failure, then $O_1 = 1$; if the
second observation is also an observed failure, then $O_2 = 2$, and so on until the first withdrawal.
Once censoring times have been encountered in the ordered set of failures and withdrawals, the
formula [71] for computing the increment, $I_j$, needed to estimate the order number of the next
failure time is

\[ I_j = \frac{n + 1 - O_{j-1}}{1 + N^+}, \]

where $O_0 = 0$ if the first item in the set is censored. Note that if there are no withdrawals between
the $(j-1)$st and $j$th failures, then $I_j = I_{j-1}$. The estimated order number for the $j$th failure
is

\[ O_j = O_{j-1} + I_j. \]

From the estimated order numbers, an estimator of the distribution function can be defined as [127]


^MONBit)  =  \ 


0, 


t  <  t(0) 

<  t  <  =  l,...,r  +  1 


1,  t> 


where $O_0 = 0$, $t_{(0)} = 0$, and $t_{(r+1)}$ is a pseudo-failure, possibly defined to be $+\infty$, or
somewhere in between. Note that other plotting positions based on the mean order numbers may be
used to form the estimator. Examples of the mean order number estimator are in Appendix C.
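The increment recursion for the mean order numbers can be illustrated with a short Python sketch. It assumes that $N^+$ counts the items remaining in the ordered sample at and after the failure being processed, an interpretation consistent with the remark that the increment is unchanged when no withdrawals intervene; the function name `mean_order_numbers` is hypothetical.

```python
def mean_order_numbers(x, delta):
    """Return (failure time, mean order number) pairs.

    x     : observed values (failure or withdrawal times)
    delta : 1 for an observed failure, 0 for a withdrawal
    """
    n = len(x)
    # Order the sample; at ties, failures precede withdrawals.
    obs = sorted(zip(x, delta), key=lambda p: (p[0], 1 - p[1]))
    previous = 0.0  # O_0 = 0
    result = []
    for position, (value, d) in enumerate(obs, start=1):
        if d == 1:
            # Assumed reading of N+: items left in the ordered sample,
            # counting the current failure.
            n_plus = n - position + 1
            increment = (n + 1 - previous) / (1 + n_plus)
            previous += increment
            result.append((value, previous))
    return result


if __name__ == "__main__":
    for time, order in mean_order_numbers([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1]):
        print(time, order)
```

Recomputing the increment at every failure is harmless: when no withdrawal lies between two failures the formula returns the same increment as before, and for a complete sample the order numbers are simply 1, 2, ..., n.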


2.2.3 The Piecewise Exponential Estimator. In 1983 Kitchin, Langberg, and Proschan
[76] published an article introducing the Piecewise Exponential Estimator (PEXE). This continuous
alternative to the Kaplan-Meier and mean order number estimators is shown to be strongly
consistent under a mild regularity condition on the distribution of the censoring variable, and its
errors are said to follow the same Gaussian process as those associated with the Kaplan-Meier
estimator [76]. Further discussions of the properties of the PEXE as well as a comparison to the
Kaplan-Meier estimator can be found in [74] and [143]. The development and rationale for the PEXE
are as follows. A set of $n$ items is placed on test at time $t = 0$ and observed until failure or
withdrawal, whichever occurs first. Let $t_i$ represent the ordered observed failure times, $c_{i,j}$
represent the ordered censoring times, and $k_i$ be the number of censored observations between
consecutive observed failures. Since both failure and censoring distributions are assumed to be
continuous, no ties will be considered. Thus,

\[
0 < c_{1,1} < \cdots < c_{1,k_1} < t_1 < c_{2,1} < \cdots < c_{2,k_2} < t_2 < \cdots
< t_{r-1} < c_{r,1} < \cdots < c_{r,k_r} < t_r < c_{r+1,1} < \cdots < c_{r+1,k_{r+1}}.
\]

An estimate of the failure rate in the interval $(t_{i-1}, t_i]$, where $i = 1, 2, \ldots, r$, and $t_0 = 0$, is
the number of failures observed in the interval, which is 1, divided by the total time on test in that
interval. This procedure estimates a constant failure rate

\[
\hat z_i = \frac{1}{\sum_{j=1}^{k_i} (c_{i,j} - t_{i-1}) + \left( n - \sum_{l=1}^{i} k_l - i + 1 \right)(t_i - t_{i-1})}
\]

for each interval between successive failures, which is then used to fit an exponential estimator of
the survivor function on each interval. The separate exponential survivor functions are pieced
together to form the PEXE of the survivor function up to the $r$th failure. Formally then, for a set
of $r$ ordered failure times from a randomly censored sample the PEXE of a survivor function is
defined as

\[
\hat S_{PEXE}(x) =
\begin{cases}
\exp(-\hat z_1 x), & 0 \le x < t_1 \\
\exp\{-[\hat z_1 t_1 + \hat z_2 (x - t_1)]\}, & t_1 \le x < t_2 \\
\exp\{-[\hat z_1 t_1 + \hat z_2 (t_2 - t_1) + \cdots + \hat z_i (x - t_{i-1})]\}, & t_{i-1} \le x < t_i,\; i = 3, \ldots, r \\
\exp\{-[\hat z_1 t_1 + \hat z_2 (t_2 - t_1) + \cdots + \hat z_r (t_r - t_{r-1}) + \hat z_r (x - t_r)]\}, & x \ge t_r.
\end{cases}
\]




Some presentations of the PEXE do not define an estimate beyond $t_r$ [74], while Westberg and
Klefsjo [143] suggest using a Weibull distribution with shape parameter 2 and scale parameter $\eta$
estimated by

\[ \hat\eta = \frac{t_r}{\sqrt{-\log(\hat S_{PEXE}(t_r))}} \]

if there is any indication that the data are from a distribution with an increasing failure rate. A
sensible choice of estimator to extrapolate beyond $t_r$ is given by Sun and Kececioglu [127] and is
used in the definition of the PEXE given above. That is, from $t_r$ to $+\infty$ use an exponential fit
with hazard rate equal to that of the neighboring interval $t_{r-1}$ to $t_r$. Examples of the PEXE are
given in Appendix C.
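The total-time-on-test construction of the PEXE can be sketched as follows. This is an illustrative Python sketch, not the implementation used in this work: the names `pexe_rates` and `pexe_survivor` are hypothetical, ties are assumed absent, and withdrawals after the last failure are ignored because extrapolation reuses the last estimated rate.

```python
import math


def pexe_rates(times, delta):
    """Failure times and the constant failure rate on each interval."""
    obs = sorted(zip(times, delta))
    at_risk = len(obs)
    fail_times, rates = [], []
    t_prev = 0.0      # left end point t_{i-1} of the current interval
    on_test = 0.0     # time on test accumulated in the current interval
    for value, d in obs:
        if d == 0:
            # A withdrawn item is on test from t_{i-1} until it is censored.
            on_test += value - t_prev
        else:
            # Items still at risk (the failing item included) are on test
            # over the whole interval.
            on_test += at_risk * (value - t_prev)
            rates.append(1.0 / on_test)
            fail_times.append(value)
            t_prev, on_test = value, 0.0
        at_risk -= 1
    return fail_times, rates


def pexe_survivor(x, fail_times, rates):
    """Piecewise exponential survivor estimate at x >= 0."""
    cumulative, t_prev = 0.0, 0.0
    for t, z in zip(fail_times, rates):
        if x < t:
            return math.exp(-(cumulative + z * (x - t_prev)))
        cumulative += z * (t - t_prev)
        t_prev = t
    # Beyond the last failure: keep the rate of the last interval.
    return math.exp(-(cumulative + rates[-1] * (x - t_prev)))


if __name__ == "__main__":
    ft, z = pexe_rates([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1])
    print(ft, z, pexe_survivor(2.5, ft, z))
```

Accumulating the exponent interval by interval keeps the estimate continuous at every failure time, which is the property that distinguishes the PEXE from the step-function estimators above.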


2.2.4 Kernel Estimators. A major limitation of the nonparametric distribution function
estimators for randomly censored samples presented so far is their inability to extrapolate
beyond the last observed failure. The kernel estimators are better in this area, particularly the one
proposed by Foldes, Rejto, and Winter [49].


Blum  and  Susarla  [11]  were  the  first  to  generalize  the  kernel-type  density  estimator  introduced 
by  Rosenblatt  in  1956  to  estimate  a  density  function  from  randomly  right-censored  data.  The 
estimator  is  defined  as 


fBSE{3^)  =  {Tlhn)  ^ 


H*(x) 


where 

n  -  + 11 

n-i?i  +  2 

is a modified product-limit estimator of the censoring distribution, $k(t)$ is a kernel function
satisfying the requirements of a probability density function, and $h_n$ is a positive bandwidth
sequence such
that  limn-^oo^n  =  0.  We  selected  the  data-dependent  bandwidth  h  =  s*n~^  where  s*  is  the 
standard  deviation  of  the  failure  set.  The  standard  normal  kernel  was  used  throughout  this  work. 


\[ k(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2} \]




Naturally, the distribution function estimate is given by

\[ F_{BSE}(x) = \int_{-\infty}^{x} f_{BSE}(t)\,dt. \]


Foldes, Rejto, and Winter [48,49] establish strong consistency of kernel-type estimators using
convergence properties of the Kaplan-Meier product-limit estimator. They defined a kernel-type
density estimator that is very similar to the one proposed by Blum and Susarla. In fact, the two
are asymptotically equivalent [108]. However, the Foldes, Rejto, and Winter (FRW) estimator has
better small sample properties and reduces to the usual Parzen-type kernel density estimator in
the case of no censoring while the Blum and Susarla estimator does not. The FRW kernel density
estimator is defined as

\[ f_{FRWE}(x) = \frac{1}{h_n} \sum_{i=1}^{n} s_i\, k\!\left( \frac{x - x_i}{h_n} \right) \]

where $s_i$ is the jump of the Kaplan-Meier estimator from $x_{i-1}$ to $x_i$ and $h_n$ is defined as it is
in the Blum-Susarla estimator. Illustrations of this estimator are given in Appendix C. The
distribution function estimate is

\[ F_{FRWE}(x) = \int_{-\infty}^{x} f_{FRWE}(t)\,dt. \]
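A kernel estimator that weights each observed failure by the jump of the Kaplan-Meier estimator can be sketched in Python as below. The sketch assumes a standard normal kernel and takes the bandwidth `h` as an argument supplied by the caller; the function names are hypothetical, and any probability mass left beyond a censored largest observation is simply not redistributed.

```python
import math


def km_jumps(x, delta):
    """Failure times and the Kaplan-Meier jump at each of them."""
    n = len(x)
    order = sorted(range(n), key=lambda i: (x[i], 1 - delta[i]))
    surv = 1.0
    points, jumps = [], []
    for rank, i in enumerate(order, start=1):
        if delta[i] == 1:
            new_surv = surv * (n - rank) / (n - rank + 1)
            points.append(x[i])
            jumps.append(surv - new_surv)  # mass placed at this failure
            surv = new_surv
    return points, jumps


def frw_density(t, points, jumps, h):
    """Jump-weighted normal-kernel density estimate at t."""
    total = sum(s * math.exp(-0.5 * ((t - p) / h) ** 2)
                for p, s in zip(points, jumps))
    return total / (h * math.sqrt(2.0 * math.pi))


def frw_cdf(t, points, jumps, h):
    """Integral of frw_density from minus infinity to t (closed form)."""
    return sum(s * 0.5 * (1.0 + math.erf((t - p) / (h * math.sqrt(2.0))))
               for p, s in zip(points, jumps))


if __name__ == "__main__":
    pts, s = km_jumps([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1])
    print(frw_density(2.5, pts, s, 0.5), frw_cdf(2.5, pts, s, 0.5))
```

Because the normal kernel integrates to the normal distribution function, the distribution function estimate needs no numerical integration in this special case; with no censoring every jump equals 1/n and the density reduces to the usual kernel density estimate.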

Kernel estimation of the hazard rate function, which is closely related to the density function,
is considered by Tanner and Wong [133]. Tanner also studied the effects of using a variable kernel
estimator, in which the bandwidth is allowed to vary, and presented sufficient conditions for its
strong consistency [132]. Yandell [151] derived simultaneous confidence bands for kernel-type
estimators and extended global deviation and mean square deviation results to the case of randomly
censored survival data. Burke and Horvath [17] prove various properties, such as strong
consistency, of density and failure rate estimators based on the Kaplan-Meier estimator in a competing
risks model and also provide limit theorems for some quadratic functionals of each. Padgett and
McNichols [109] provide a summary of density estimators for randomly censored data in which they
outline all of the estimators developed by 1984.

Additionally,  McNichols  and  Padgett  [94]  introduced  an  adaptive  kernel  density  estimator 
for  randomly  right-censored  data  and  discussed  the  convergence  in  probability  and  the  almost  sure 
convergence  of  the  modified  estimator.  An  asymptotic  look  at  the  decomposition  of  integrated 
squared  error  into  variance  and  squared  bias  components  is  given  by  Marron  and  Padgett  [92]  in 
addition  to  a  data-based  method  of  selecting  an  asymptotically  optimal  bandwidth  for  kernel  density 
estimators  under  random  censorship.  Padgett  [108]  presents  and  summarizes  these  nonparametric 
density  estimators  as  well  as  histogram  estimators  for  randomly  censored  data. 

Although  kernel  density  estimation  is  computationally  intensive  and  somewhat  cumbersome, 
it  is  an  excellent  way  to  gain  insight  into  the  general  shapes  of  the  density,  distribution,  survivor, 
and  hazard  functions  of  the  true  underlying  process  outside  of  the  confines  of  a  parametric  model. 
It  is  also  a  very  good  way  to  get  an  impression  of  skewness  and  modality.  One  of  the  primary 
advantages  that  kernel  estimators  have  over  all  of  the  other  estimators  is  their  ability  to  provide 
smooth,  continuous  estimates  of  density  functions.  These  insights  can  be  a  valuable  starting  point 
for  choosing  an  appropriate  parametric  distribution  for  goodness-of-fit  testing. 

2.2.5 New Trigonometrically-Smoothed and Jackknifed Estimators for Randomly Censored
Data. In 1982 James Sweeder developed a class of trigonometrically-smoothed and jackknifed
estimators of distribution and density functions based on the EDF under the supervision of Dr.
Albert H. Moore in a doctoral dissertation for the Air Force Institute of Technology [131]. This
class of estimators not only provides continuous estimates of distribution functions but can be
differentiated to give estimates of density functions as well. Furthermore, Sweeder’s estimators
were shown to outperform the EDF according to the MISE criterion, provide empirical convergence
rates that are quite close to those of other methods of estimation for complete samples, and work
well for relatively small sample sizes. Sweeder’s estimator can be generalized to estimate both
distribution and density functions for randomly censored data by substituting the Kaplan-Meier
estimator or the mean order number estimator in place of the EDF.

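Analogous to Sweeder’s estimator, we begin with a trigonometrically smoothed estimate of
the distribution function making use of the cosine function as follows:

\[
F_n(x) =
\begin{cases}
0, & x < x_{(0)} \\
G_i + \dfrac{G_{i+1} - G_i}{2} \left[ 1 - \cos \pi \left( \dfrac{x - x_{(i)}}{x_{(i+1)} - x_{(i)}} \right) \right], & x_{(i)} \le x < x_{(i+1)},\; i = 0, \ldots, n \\
1, & x \ge x_{(n+1)}
\end{cases}
\]

where $G_i$, $i = 1, \ldots, n$, represents a plotting position for the $i$th ordered observation obtained
via the Kaplan-Meier or the mean order number estimator, and the extrapolation points $x_{(0)}$ and
$x_{(n+1)}$ are defined such that $G_0 = 0$ and $G_{n+1} = 1$. Differentiating leads to the density
function estimator

\[
f_n(x) =
\begin{cases}
\dfrac{\pi}{2} \left( \dfrac{G_{i+1} - G_i}{x_{(i+1)} - x_{(i)}} \right) \sin \pi \left( \dfrac{x - x_{(i)}}{x_{(i+1)} - x_{(i)}} \right), & x_{(i)} \le x < x_{(i+1)},\; i = 0, \ldots, n \\
0, & \text{elsewhere.}
\end{cases}
\tag{12}
\]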
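The cosine interpolation between plotting positions and its derivative can be checked numerically with a small Python sketch. The function names `trig_cdf` and `trig_density` are hypothetical; the sketch assumes strictly increasing knots $x_{(0)} < \cdots < x_{(n+1)}$ with plotting positions running from $G_0 = 0$ to $G_{n+1} = 1$.

```python
import math
from bisect import bisect_right


def trig_cdf(x, knots, G):
    """Cosine-smoothed distribution function estimate at x."""
    if x < knots[0]:
        return 0.0
    if x >= knots[-1]:
        return 1.0
    i = bisect_right(knots, x) - 1          # knots[i] <= x < knots[i + 1]
    u = (x - knots[i]) / (knots[i + 1] - knots[i])
    return G[i] + 0.5 * (G[i + 1] - G[i]) * (1.0 - math.cos(math.pi * u))


def trig_density(x, knots, G):
    """Derivative of trig_cdf; zero outside the knots and at each knot."""
    if x < knots[0] or x >= knots[-1]:
        return 0.0
    i = bisect_right(knots, x) - 1
    width = knots[i + 1] - knots[i]
    u = (x - knots[i]) / width
    return 0.5 * math.pi * (G[i + 1] - G[i]) / width * math.sin(math.pi * u)


if __name__ == "__main__":
    knots = [0.0, 1.0, 2.0, 3.0]
    G = [0.0, 0.25, 0.75, 1.0]
    print(trig_cdf(1.5, knots, G), trig_density(1.5, knots, G))
    print(trig_density(1.0, knots, G))  # the density vanishes at a knot
```

The last line of the example shows the zero derivative at an observed point, which is the defect that the subsample averaging described next is meant to sidestep.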


The  trigonometric  smoothing  creates  a  continuous  estimate  of  the  distribution  function,  but 
the  density  estimate  still  has  problems  because  the  derivative  of  the  distribution  function  is  zero 
at  the  observed  failure  points.  This  is  where  the  jackknife  comes  in.  Sweeder  [131]  uses  a  method 
proposed  for  bias  reduction  by  Quenouille  [113]  to  sidestep  the  zero  derivative  points.  This  variation 
of  the  jackknife  approach  involves  the  division  of  the  sample  distribution  function  into  subsamples. 
This  procedure  is  outlined  by  Sweeder  [131:  pp.  19-20]  for  complete  samples. 

For randomly censored samples, obtaining the subsamples can be achieved as follows. Let
$t_{(1)}, \ldots, t_{(r)}$ be the set of ordered failure times from a randomly censored sample, and let
$k$ be the number of subsamples chosen. Let $\ell = 1, \ldots, k$ index the subsamples and let
$y_{(j,\ell)}$ be the $j$th element of subsample $\ell$. Now, with $k$ subsamples, $r = km + R$ where
$m = \lfloor r/k \rfloor$ and $R = r \bmod k$. The assignments are made to the subsamples as

\[ y_{(j,\ell)} = t_{((j-1)k + \ell)} \]

where

j = 1, ..., m      if ℓ > R
j = 1, ..., m + 1  if ℓ ≤ R.

This creates $k$ ordered subsamples, $R$ of size $m + 1$ and $k - R$ of size $m$. Let

\[
n_\ell =
\begin{cases}
m & \text{if } \ell > R \\
m + 1 & \text{if } \ell \le R.
\end{cases}
\]
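The division into subsamples can be illustrated with a one-line slicing operation in Python. The sketch assumes a systematic assignment in which subsample ℓ receives every k-th ordered failure time starting from the ℓ-th one, which produces exactly R subsamples of size m + 1 and k − R of size m; the function name `split_subsamples` is hypothetical.

```python
def split_subsamples(failures, k):
    """Split ordered failure times into k interleaved subsamples.

    Subsample ell (0-based here) takes the ordered failure times with
    indices ell, ell + k, ell + 2k, ..., so each subsample is itself
    ordered and spans the range of the data.
    """
    ordered = sorted(failures)
    return [ordered[ell::k] for ell in range(k)]


if __name__ == "__main__":
    # r = 7 failures and k = 3 subsamples: m = 2 and R = 1, so one
    # subsample has m + 1 = 3 elements and two have m = 2 elements.
    for subsample in split_subsamples([7, 1, 4, 2, 6, 3, 5], 3):
        print(subsample)
```

Interleaving rather than cutting the ordered sample into consecutive blocks means the knots of different subsamples fall at different points, so the zero-derivative points of one smoothed estimate are covered by the others when the estimates are averaged.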

In  practice,  extrapolation  points  for  each  subsample  can  be  obtained  by  assigning 

^(0,^)  =  niax{2y(i^^)  -  y(2,^),^(i)} 


and 


5^(n»+i,«)  =  max{2y(„.,^)  - 


Using each of the $k$ subsamples, we construct continuous, differentiable estimates of the
distribution function through equation 5. Finally, the sample distribution function estimator is

\[
F_n^*(x) =
\begin{cases}
0, & x < x_{(1)} \\
\dfrac{1}{k} \sum_{\ell=1}^{k} F_{\ell, n_\ell}(x), & x_{(1)} \le x < x_{(n)} \\
1, & x \ge x_{(n)}.
\end{cases}
\tag{13}
\]




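Similarly, the density estimator is defined as

\[
f_n^*(x) =
\begin{cases}
\dfrac{1}{k} \sum_{\ell=1}^{k} f_{\ell, n_\ell}(x), & x_{(1)} \le x \le x_{(n)} \\
0, & \text{elsewhere.}
\end{cases}
\]

Plots illustrating the use of this estimator with and without the jackknifing are given in Appendix C.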


2.3  Semi-Parametric  Estimation 

2.3.1 The Klein, Lee, and Moeschberger Partially Parametric Estimator. In 1990 Klein,
Lee and Moeschberger [78] introduced an estimator of the survivor function (and hence the
distribution function) for randomly censored data. Here it will be referred to as the KLM estimator.
This estimator is relatively easy to construct and compares favorably to the completely nonparametric
estimators. For the parametric part of this partially parametric estimator, we let $R(x \mid \theta)$
denote a parametric survivor function where $\theta$ is a vector of unknown parameters estimated by
some consistent estimator $\hat\theta$ based on the observed data. One could inspect a kernel density
estimate or a product-limit estimate of the function or perhaps use some prior knowledge of a
process to select a reasonable parametric model.

Recall that in the competing risks model of random censorship $x_i = \min\{t_i, c_i\}$ and
$\delta_i = I[t_i < c_i]$. For some arbitrary value $x$, there are four distinct possibilities: (1) the
observed value $x_i$ is greater than $x$ and represents an observed failure, (2) the observed value
$x_i$ is greater than $x$ and corresponds to a censoring time, (3) $x_i$ is less than $x$ and represents
a failure, and (4) $x_i$ is less than $x$ and represents a censored observation. The scenarios
represented by cases (1), (2), and (3) can all be handled nonparametrically. However, case (4)
presents the need for some way to estimate the probability that an item whose lifetime will
ultimately be censored would have survived beyond $x$.
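The parametric model is used for this. With that in mind, the KLM partially parametric estimator
of a survivor function of a randomly censored sample of size $n$ is given by

\[ \hat S_{KLM}(x) = \frac{1}{n} \sum_{i=1}^{n} \phi_i(x \mid R(\cdot \mid \hat\theta)) \]

where

\[
\phi_i(x \mid R(\cdot \mid \hat\theta)) =
\begin{cases}
1 & \text{if } x_i > x \text{ and } \delta_i = 0 \text{ or } 1 \\
0 & \text{if } x_i \le x \text{ and } \delta_i = 1 \\
\dfrac{R(x \mid \hat\theta)}{R(x_i \mid \hat\theta)} & \text{if } x_i \le x \text{ and } \delta_i = 0.
\end{cases}
\]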


This estimator reduces to the usual complete sample EDF in the case of no censoring. Further,
it can be seen that the greater the proportion of censoring, the more the estimator will resemble
the parametric estimator $R(x \mid \hat\theta)$. Examples can be found in Appendix C. In fact, when
$R(x \mid \theta)$ is chosen correctly, the estimator converges uniformly if the parameter estimates
converge, such as when maximum likelihood is used [78]. Klein, Lee, and Moeschberger [78] also
state that the process $\sqrt{n}\,(\hat S_{KLM}(x) - S(x))$ converges weakly to a zero mean Gaussian
process. However, the covariance of this process is very difficult to determine for many
distributions. They [78] were able to derive the covariance function in the case of an exponential
failure distribution subject to an exponential censoring distribution and found that the asymptotic
variance of the KLM estimator is always smaller than that of the Kaplan-Meier estimator, assuming
the parametric part of the model is chosen correctly.
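The three-way contribution of each observation to the KLM estimator can be made concrete with a short Python sketch. For illustration only, the parametric part is taken to be exponential, with its mean estimated by maximum likelihood under censoring (total observed time divided by the number of failures); that choice and the function names are assumptions of this sketch rather than part of the estimator's definition.

```python
import math


def klm_survivor(t, x, delta, R):
    """Partially parametric survivor estimate at t.

    R is a fitted parametric survivor function, called as R(value).
    """
    total = 0.0
    for xi, di in zip(x, delta):
        if xi > t:
            total += 1.0                # still under observation at t
        elif di == 0:
            total += R(t) / R(xi)       # censored before t: P(T > t | T > x_i)
        # an observed failure at or before t contributes 0
    return total / len(x)


def exponential_survivor_mle(x, delta):
    """Exponential survivor function fitted by maximum likelihood."""
    mean = sum(x) / sum(delta)          # total time on test / failures
    return lambda value: math.exp(-value / mean)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    delta = [1, 0, 1, 1]
    R = exponential_survivor_mle(x, delta)
    print(klm_survivor(2.5, x, delta, R))
```

When every observation is a failure the parametric branch is never reached and the estimate is the empirical survivor function; as the share of censored observations grows, more of the sum comes from the ratio R(t)/R(x_i), which is why the estimator drifts toward the parametric fit under heavy censoring.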


2.3.2 A Semi-Parametric Kaplan-Meier Estimator. Since the Kaplan-Meier estimator
provides no estimate of a survivor function beyond the maximum observation when it corresponds to
a censoring time, some authors [77,117] use a parametric distribution for this purpose. Although any
appropriate distribution may be chosen for a given situation, the Weibull distribution is generally
accepted as providing a reasonable fit to lifetime data. For the Kaplan-Meier estimator, if the
maximum observation is an observed failure time, then $S_n(x) = 0$ for $x \ge x_{(n)}$. In 1998,
Ruffiin [117] used the two-parameter Weibull distribution to estimate $S(x)$ for $x > x_{(n)}$ when
$x_{(n)}$ represents a censoring time, using maximum likelihood to provide parameter estimates.
Klein and Moeschberger [77] used the same approach but tied the estimators together at $x_{(n)}$ by
maximizing the likelihood function with respect to the parameters of the two-parameter Weibull
distribution subject to the constraint

\[ e^{-(x_{(n)}/\eta)^{\beta}} = S_n(x_{(n)}) \]

where $\eta$ and $\beta$ are the scale and shape parameters, respectively. This practice will also
guarantee a monotonic estimate, whereas the “untied” version may not. Examples are given in
Appendix C.
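Requiring the Weibull tail to pass through the Kaplan-Meier value at the largest observation, that is, exp(−(x_(n)/η)^β) = S_n(x_(n)), determines the scale once the shape is fixed, so the constrained maximization reduces to a search over the shape alone. The Python sketch below shows only that algebraic step; the function names are hypothetical and the likelihood search itself is not shown.

```python
import math


def tied_scale(beta, x_max, s_max):
    """Weibull scale forced by the tie at the largest observation.

    beta  : Weibull shape parameter
    x_max : largest observation x_(n), here a censoring time
    s_max : Kaplan-Meier survivor estimate at x_max, with 0 < s_max < 1
    """
    return x_max / (-math.log(s_max)) ** (1.0 / beta)


def weibull_survivor(x, eta, beta):
    """Two-parameter Weibull survivor function."""
    return math.exp(-((x / eta) ** beta))


if __name__ == "__main__":
    eta = tied_scale(2.0, 100.0, 0.3)
    # The tail estimate matches the Kaplan-Meier value at x_(n)
    # and decreases from there.
    print(eta, weibull_survivor(100.0, eta, 2.0), weibull_survivor(150.0, eta, 2.0))
```

Because the tail starts at the Kaplan-Meier value and a Weibull survivor function is decreasing, the combined estimate cannot jump upward at x_(n), which is the monotonicity guarantee mentioned above.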

2.4  A  Comparison  of  Distribution  Function  Estimators 

A comparison of estimation techniques was conducted using the mean integrated squared
error (MISE) as well as nonparametric Kruskal-Wallis tests on integrated squared error (ISE) of
the estimated cumulative distribution function (CDF) for each. MISE is a measure of discrepancy
between the true CDF and the estimated CDF, and so an estimator yielding a smaller value is
a better estimator under the given circumstances. Furthermore, the standard deviation (shown
in parentheses) of the integrated squared error was obtained to give some idea of the precision
of the Monte Carlo simulation. One thousand samples of sizes 20 and 60 were evaluated at 0.25,
0.50, and 0.75 expected proportion of censoring. Numerical integration was performed using
Simpson’s rule with 1001 function evaluations. For the exponential and 2-parameter Weibull
distributions, a lower limit of 0 and an upper limit of $x_{(n)} + 200$ were used. For the 3-parameter
Weibull the lower limit of integration was chosen to be the maximum of the true location parameter
value and its estimate. The upper limit was $x_{(n)} + 200$ for each sample. Since distribution
function estimation beyond $x_{(n)}$ presents difficulties for some estimators, $x_{(n)} + 200$ seemed
an appropriate choice for the upper integration limit with the parameter values used here to compare
each estimator’s ability to extrapolate beyond $x_{(n)}$ in the case of randomly censored data.
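The ISE computation described here, composite Simpson's rule with 1001 function evaluations applied to the squared difference between the estimated and true CDFs, can be sketched in Python as follows. The function names are hypothetical and the two exponential CDFs in the example are stand-ins chosen only to exercise the code; the MISE reported in the tables is the average of such ISE values over the simulated samples.

```python
import math


def simpson(f, a, b, evaluations=1001):
    """Composite Simpson's rule using an odd number of evaluations."""
    if evaluations < 3 or evaluations % 2 == 0:
        raise ValueError("evaluations must be odd and at least 3")
    panels = evaluations - 1
    h = (b - a) / panels
    total = f(a) + f(b)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def integrated_squared_error(F_hat, F_true, lower, upper):
    """ISE between an estimated and a true distribution function."""
    return simpson(lambda t: (F_hat(t) - F_true(t)) ** 2, lower, upper)


if __name__ == "__main__":
    def F_true(t):
        return 1.0 - math.exp(-t / 50.0)

    def F_hat(t):
        # Stand-in "estimate": an exponential CDF with the wrong mean.
        return 1.0 - math.exp(-t / 60.0)

    print(integrated_squared_error(F_hat, F_true, 0.0, 400.0))
```

Simpson's rule is designed for smooth integrands, so for step-function estimators such as the KME the rule is only approximate near the jumps; a fine grid such as the 1001 points used here keeps that effect small.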




The study was conducted for the exponential distribution, two cases of the 2-parameter
Weibull (location known) with shape parameter values 2 and 3.5, and two cases of the 3-parameter
Weibull also with shape parameter values 2 and 3.5. The censoring distribution in all cases was
chosen to be an exponential distribution with its parameters adjusted to provide the desired expected
proportion of censoring. Table 6 shows the parameter combinations used for each case. All of


Table 6  Parameter Values for MISE Comparison.

Failure         Parameter            Censoring       Parameter       Expected
Distribution    Values               Distribution    Values          Censoring
Exponential     η = 50               Exponential     θ = 150         q = .25
                                                     θ = 50          q = .50
                                                     θ = 16 2/3      q = .75
Weibull         β = 2, η = 50        Exponential     θ = 148         q = .25
                                                     θ = 57.75       q = .50
                                                     θ = 25.7        q = .75
Weibull         β = 3.5, η = 50      Exponential     θ = 154         q = .25
                                                     θ = 62.5        q = .50
                                                     θ = 30          q = .75
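The censoring parameters in Table 6 can be checked against the expected proportion of censoring, q = P(C < T), which for an exponential censoring time with mean θ equals the integral over c ≥ 0 of the censoring density (1/θ)exp(−c/θ) times the failure survivor function S(c). The Python sketch below evaluates that integral numerically; the function name, the truncation point of the integral, and the number of evaluations are assumptions of this sketch.

```python
import math


def expected_censoring(theta, survivor, upper=1000.0, evaluations=2001):
    """P(C < T) for exponential censoring with mean theta.

    survivor : survivor function of the failure time, called as survivor(c)
    upper    : truncation point for the integral over (0, infinity)
    """
    def integrand(c):
        return math.exp(-c / theta) / theta * survivor(c)

    # Composite Simpson's rule on [0, upper].
    panels = evaluations - 1
    h = upper / panels
    total = integrand(0.0) + integrand(upper)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return total * h / 3.0


if __name__ == "__main__":
    # Exponential failures with mean 50 and theta = 150.
    print(expected_censoring(150.0, lambda c: math.exp(-c / 50.0)))
    # Weibull failures with shape 2, scale 50 and theta = 148.
    print(expected_censoring(148.0, lambda c: math.exp(-(c / 50.0) ** 2)))
```

For exponential failures the integral has the closed form η/(η + θ), which gives exactly 0.25, 0.50, and 0.75 for θ = 150, 50, and 16 2/3 with η = 50; for the Weibull cases no such closed form is available, which is presumably why the tabulated values of θ are not round numbers.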

Table 7  Key to Abbreviations of Estimators.

Abbreviation    Description
MLE             Maximum Likelihood Estimation
KME             Kaplan-Meier Estimation
TS-KME          Trigonometrically-Smoothed KME
TSJ-KME         Trigonometrically-Smoothed and Jackknifed KME
MONE            Mean Order Number Estimation
PEXE            Piecewise Exponential Estimation
FRWE            Foldes, Rejto, and Winter Kernel Estimation
BSKE            Blum and Susarla Kernel Estimation
KLME            Klein, Lee and Moeschberger Partially Parametric Estimation
SP-KME          Semi-Parametric KME

the nonparametric estimators are location invariant provided the censoring distribution makes the
same location shift; therefore, the MISE for the nonparametric methods is the same regardless of
whether a location parameter must be estimated or not. As expected, however, the MISE for the
Weibull distribution when using parametric estimation is slightly higher when a location parameter
must be estimated, but is still better than the nonparametric methods when the underlying family


[Rotated table of simulation results; the entries did not survive text extraction.]


MISE of the estimated distribution function, with the standard deviation of the ISE in parentheses.

                           n = 20                                                     n = 60
Estimator    q = 0.25           q = 0.50           q = 0.75            q = 0.25           q = 0.50           q = 0.75
Parametric
MLE          0.5374 (0.5899)    0.8353 (1.0528)    1.6297 (2.1278)     0.1750 (0.2035)    0.2601 (0.3195)    0.6352 (0.9155)
Nonparametric
KME          1.0362 (0.8034)    1.6501 (1.1619)    3.6714 (2.1618)     0.3120 (0.2312)    0.5169 (0.3805)    1.4160 (0.8854)
TS-KME       0.7772 (0.6565)    1.3414 (1.1679)    4.4250 (3.7516)     0.2694 (0.2150)    0.4238 (0.3349)    1.3799 (1.1243)
TSJ-KME      1.1414 (0.9573)    1.5575 (1.2258)    2.9587 (1.9630)     0.3239 (0.2626)    0.4816 (0.3672)    1.1466 (0.7908)
MONE         0.8691 (0.6962)    1.5952 (1.5747)    4.1457 (3.4197)     0.2870 (0.2345)    0.4930 (0.4046)    1.4965 (1.1604)
PEXE         0.7312 (0.6316)    1.3348 (2.1162)    4.0261 (7.2607)     0.2635 (0.2182)    0.4097 (0.3666)    1.3375 (2.5702)
FRWE         0.6525 (0.6400)    0.9503 (0.9192)    2.2049 (1.7738)     0.2280 (0.2098)    0.3521 (0.3422)    0.9546 (0.7999)
BSKE         1.1738 (1.4169)    5.1860 (8.0251)    26.7995 (26.494)    0.3020 (0.2810)    1.2848 (1.5940)    12.1120 (13.240)
Semi-parametric
KLME         0.6911 (0.5847)    0.9285 (1.0255)    1.6210 (1.9378)     0.2320 (0.2106)    0.2960 (0.3185)    0.6419 (0.9139)
SP-KME       0.8014 (0.6146)    1.2305 (1.2057)    2.2470 (2.5209)     0.2782 (0.2244)    0.4272 (0.3737)    1.0189 (1.2686)



of  the  distribution  is  known.  This  brings  up  the  important  point  that  the  ML  estimation  has  the 
advantage  of  the  assumption  of  a  correctly  specified  model  in  this  comparison. 

Of the nonparametric estimation methods, the estimator proposed by Földes, Rejtő, and Winter is the best according to the MISE criterion, with the PEXE not far behind. The KME is the next best and the easiest to compute, while the estimator of Blum and Susarla is the worst, particularly for heavier censoring.

The semi-parametric methods fall between the nonparametric and parametric methods in terms of MISE. They do, however, have a slight advantage over the nonparametric estimators in this study because the Weibull distribution was used for the parametric components, which happened to be the distribution used in this comparison. The Klein, Lee, and Moeschberger estimator, for example, behaves more and more like the ML estimator with increasing amounts of censoring. The MISE was not computed for the semi-parametric estimators for the cases using the 3-parameter Weibull; estimation of the location parameter would be necessary there and would most likely result in slight increases in MISE.

For further comparison, Monte Carlo samples of 1000 ISE measures were obtained for each of the 3 distributions under consideration for samples of size 20 and 60 at each level of expected censoring. That is, the 10 estimation techniques were compared under 18 different conditions. This amounts to a very large number of comparisons when taking into account all of the pairwise comparisons as well as comparisons using various subsets of the 10 estimators under each scenario. A great many of these subsets and pairwise comparisons were examined using the Kruskal-Wallis test to try to determine which estimators could be considered better and which could be considered equal for the sample sizes and proportions of censoring considered in this study. The comparison criterion here was not MISE but the results of Kruskal-Wallis tests. The Kruskal-Wallis test makes no assumptions about the underlying distribution of the ISE, whereas the comparison via the two-number summary of mean and standard deviation implies a more symmetric underlying distribution, which the ISE does not have. Although the results from all of the Kruskal-Wallis tests performed are not printed in this document, the overall summary of the comparison of the 10 estimators, based on the results of many Kruskal-Wallis tests and many side-by-side boxplot comparisons of ISE, is as follows. When the underlying distribution is correctly specified, maximum likelihood is significantly better than any of the other estimators in all cases examined here. In most cases, especially when expected censoring is at 50% or 75%, the kernel estimator of Blum and Susarla is significantly worse than all of the other estimators. The reason is its inability to extrapolate beyond the last observed failure time, as illustrated in Figures 25, 26, and 27 in Appendix C. The semi-parametric Kaplan-Meier estimator and the partially parametric Klein, Lee, and Moeschberger estimator (KLME) performed competitively, especially the KLME, but did enjoy the advantage of having a correctly specified parametric part. In fact, for a data set in which there is 100% censoring the KLME is identical to the MLE.
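As a rough illustration of the kind of rank-based comparison described above, the following Python sketch computes the Kruskal-Wallis H statistic for two or more samples of ISE values. It is a minimal sketch and not the code used in this study: the function names `kruskal_wallis_h` and `pairwise_p_value` are hypothetical, no tie correction is applied (ISE values are assumed to be continuous), and the p-value relies on the large-sample chi-squared approximation with one degree of freedom, so it applies only to a comparison of two estimators.

```python
import math


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for two or more samples (no tie correction)."""
    # Pool all values, remembering which group each one came from.
    pooled = sorted((value, g) for g, sample in enumerate(groups) for value in sample)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        # Find the block of tied values starting at position i.
        j = i
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2.0 + 1.0  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += mid_rank
        i = j + 1
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


def pairwise_p_value(h):
    """Approximate p-value for a two-sample comparison (chi-squared, 1 df)."""
    return math.erfc(math.sqrt(max(h, 0.0) / 2.0))


# Hypothetical ISE values for two estimators under one scenario.
ise_a = [0.61, 0.45, 0.72, 0.38, 0.55]
ise_b = [0.93, 1.10, 0.81, 1.42, 0.77]
h_stat = kruskal_wallis_h(ise_a, ise_b)
print(h_stat, pairwise_p_value(h_stat))
```

With more than two groups the same H statistic applies, but the reference distribution has k - 1 degrees of freedom and the one-line p-value above no longer holds.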

Now, when comparing the strictly nonparametric estimators, two of them stood out above the rest: the piecewise exponential estimator (PEXE) and the kernel estimator proposed by Földes, Rejtő, and Winter (FRWE). The PEXE and the FRWE performed significantly better than the three Kaplan-Meier-based estimators and the mean order number estimator (MONE) in nearly all of the comparisons. Furthermore, in a head-to-head comparison the PEXE and the FRWE were statistically equivalent for the Weibull distributions at sample size n = 20 regardless of the amount of censoring. However, the PEXE had significantly lower ISE when the underlying distribution was exponential, whereas the FRWE had a significantly lower ISE than the PEXE for both of the Weibull distributions at n = 60 for each level of censoring. In general, the MONE was not significantly different from the three Kaplan-Meier-based estimators with the exception of the cases when expected censoring was 75%, where it was significantly worse. The three Kaplan-Meier-based estimators performed similarly under most of the conditions. The trigonometric smoothing offered no significant improvement in ISE of the Kaplan-Meier estimator, and the jackknifing procedure actually increased ISE at 25% censoring for each distribution and tended to have higher ISE for the symmetric distribution. It is also interesting to note that all of the estimators are better in the sense of ISE at estimating symmetric distributions, such as the Weibull with shape 3.5, than skewed ones like the exponential.


III. Goodness-of-Fit


3.1 Literature Review

3.1.1 Tests of Simple Hypothesis. Many of the procedures found in the literature are for testing fit to a completely specified distribution function. In 1976, Koziol and Green [81] derived Cramér-von Mises type statistics using the Kaplan-Meier product limit estimate of the EDF for randomly censored samples in which the censoring survivor function, say $\bar{H}$, depends on the failure survivor function, $\bar{F}$, in the following way: $\bar{H} = \bar{F}^{b}$, where $b$ is a censoring parameter between 0 and 2. The proportion of censoring is given by $b/(b+1)$. The statistics are used to test goodness-of-fit to a completely specified distribution function. They present asymptotic percentage points as well as a power study against certain alternatives at various levels of censoring. Csörgő and Horváth [31] consider using the Efron transformation [43] to convert the Koziol-Green statistic into one with a different asymptotic distribution. Hyde [69] used martingale theory in 1977 to show asymptotic normality of statistics used for testing the fit of a randomly censored sample to a completely specified distribution function for both discrete and continuous random variables. In 1979 Hollander and Proschan [66] developed a test statistic to test whether an underlying distribution subject to random censoring is a completely specified distribution and compared it to both the Koziol-Green statistic and Hyde’s procedure. Ebrahimi and Habibullah [42] modified the Hollander and Proschan [66] statistic in 1992 to enhance its effectiveness for instances when the failure distribution and censoring distribution are proportionally related (as in the Koziol-Green model).
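The proportional-hazards censoring relation can be checked by simulation. The sketch below assumes an exponential lifetime with mean `theta` and takes the censoring variable to be exponential with rate `b / theta`, so that the censoring survivor function is the lifetime survivor function raised to the power b and the expected proportion of censoring is b/(b+1). The function name and its arguments are illustrative assumptions and are not taken from the cited papers.

```python
import random


def simulate_koziol_green(n, theta, q, rng):
    """Draw n randomly censored exponential observations.

    q is the expected proportion of censoring (0 < q < 1). Each observation
    is returned as (observed_time, is_failure).
    """
    b = q / (1.0 - q)  # censoring parameter, since q = b / (b + 1)
    data = []
    for _ in range(n):
        lifetime = rng.expovariate(1.0 / theta)  # mean theta
        censor = rng.expovariate(b / theta)      # survivor = (lifetime survivor) ** b
        data.append((min(lifetime, censor), lifetime <= censor))
    return data


rng = random.Random(1999)
sample = simulate_koziol_green(20000, theta=2.0, q=0.5, rng=rng)
censored_fraction = sum(1 for _, failed in sample if not failed) / len(sample)
print(censored_fraction)  # should be close to q
```

Choosing q first and solving for b mirrors the way censoring levels such as 25%, 50%, and 75% are usually specified in a simulation study.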

Turnbull and Weiss [136] proposed a likelihood ratio statistic in 1978 for testing goodness-of-fit with grouped data subject to random right censoring. Similarly, Gail and Ware [51] in 1979 presented a statistic for comparing grouped, randomly right-censored data to a completely specified distribution function or a known comparison curve derived from actuarial tables. Another test for grouped data was given in 1984 by O’Neill [104]. Versions of the Kolmogorov-Smirnov, Kuiper, and Cramér-von Mises statistics were introduced in 1980 by Koziol [80] for testing goodness-of-fit with randomly censored data when the distribution in the null hypothesis is completely specified. In this paper, Koziol provided a power study in addition to a Monte Carlo analysis of the adequacy of the asymptotic distributions in finite samples. Burke [15] derived a test statistic for testing exponentiality with randomly censored samples for a simple hypothesis in 1980 as well. The Kolmogorov-Smirnov statistic was modified by Fleming, O’Fallon, O’Brien, and Harrington [46] to test goodness-of-fit of a completely specified distribution in the presence of random censoring. Hollander and Peña in 1992 [65] developed a chi-squared goodness-of-fit test for randomly censored data.


3.1.2 Tests of Composite Hypothesis. Relatively few goodness-of-fit tests have been developed for randomly right-censored data in the composite hypothesis case, in which parameters are estimated. Previously published tests are displayed in Table 11. The first procedure was a chi-squared test presented by J. Chen in a Ph.D. dissertation for Oregon State University in 1975 [22]. Another version of a chi-squared test for randomly censored data was developed by Habib [58], also in a doctoral dissertation for Oregon State University (1981), and was later published by Habib and Thomas in 1986 [57]. Other variations of the chi-squared test for the composite hypothesis with randomly censored data are given by Akritas in 1988 [3] and J. H. Kim in 1993, each of which can also be used to test simple hypotheses. C. H. Chen [20,21] derived a correlation-based statistic in a 1982 dissertation, a generalization of the Shapiro-Francia statistic, for testing goodness-of-fit with randomly censored data when a scale parameter or both location and scale parameters are unknown.

In  1981  Nair  [103]  proposed  two  classes  of  nonparametric,  large-sample  tests  of  fit  based 
on  the  maximum  and  average  weighted  difference  between  the  specified  survival  function  and  the 
Kaplan-Meier  estimate  of  the  true  survival  function. 




Table 11 Previous Goodness-of-Fit Tests for Randomly Censored Data with Composite $H_0$.

Year   Author(s)         Type           Comments
1975   J. Chen           Chi-Squared
1980   M. Burke          Modified EDF   Applied to exponential failure with exponential censoring.
1982   C. H. Chen        Correlation    Applied to exponential failure with various censoring distributions including the exponential.
1986   Habib & Thomas    Chi-Squared    Derived asymptotic distribution of the stochastic process $Z(t) = n^{1/2}[F_n(t) - F_T(t;\hat{\theta}_n)]$.
1988   M. Akritas        Chi-Squared
1993   J.H. Kim          Chi-Squared

A test for the exponential distribution when the scale parameter is estimated was derived by Burke in 1980 [15]. More specifically, Burke considers the case of an exponential lifetime distribution censored randomly on the right by an exponentially distributed censoring variable. Making use of the Efron transformation, Burke’s Cramér-von Mises-type test statistic is constructed in such a way that, regardless of the degree of censoring, its asymptotic distribution is exactly the same as that of the Cramér-von Mises statistic for complete samples when the scale parameter is estimated. However, simulations indicate that the small sample properties of Burke’s test statistic depend on the level of censoring as well as on sample size and are not the same as those of the Cramér-von Mises statistic in the uncensored case. No small sample studies of Burke’s statistic have been published, and consequently no small sample percentage points are available for goodness-of-fit testing. Percentage points were therefore generated through Monte Carlo simulation for Burke’s statistic for sample sizes 20(20)100 and proportions of censoring 0.20, 0.50, and 0.80 and are tabled in Section 3.5.2. Monte Carlo samples of size 250,000 were used.


3.2  Asymptotic  Distributions  of  KME-Modified  Test  Statistics 

3.2.1 Literature Review of General Asymptotic Theory. The function $Z(t) = \sqrt{n}\,[F_n(t) - F_T(t)]$ represents a stochastic process in $t$ and can be used to construct a variety of goodness-of-fit test statistics, including the Anderson-Darling and the Cramér-von Mises statistics. Efron [43: page 843] mentions the limiting normality of this stochastic process, namely that as $n$ approaches infinity, $Z(t)$ approaches a Gaussian stochastic process with mean 0 and covariance function

$$\operatorname{cov}[Z(s), Z(t)] = [1 - F_T(s)][1 - F_T(t)] \int_0^s \frac{dF_T(u)}{[1 - F_C(u)][1 - F_T(u)]^2} \qquad (14)$$

for $s \le t$. A proof of this convergence is given in Breslow and Crowley [13], though Burke contends that the proof is incomplete [16]. Let us assume now that we are dealing only with positive random variables. Efron [43], using a transformation from Doob [36], states that

$$\frac{Z(a^{-1}(t))}{1 - F_T(a^{-1}(t))}$$

is a stochastic process whose limit is a standard Wiener process on $0 \le s \le t \le T$, where $a^{-1}(t)$ is the inverse function of

$$a(t) = \int_0^t \frac{dF_T(u)}{[1 - F_C(u)][1 - F_T(u)]^2}$$

and $T$ is any value such that $1 - F_T(T) > 0$. A standard Wiener process is a Gaussian process with mean 0 and $\operatorname{cov}(s,t) = \min\{s,t\}$, also commonly known as Brownian motion.

Looking more closely at Doob’s transformation [36] affords a little insight into the formulation of $a(t)$ and $a^{-1}(t)$. The purpose of the transformation is to convert a general Gaussian process into a more manageable Brownian motion, or standard Wiener, process. According to Doob [36], this can be achieved if the covariance function of the Gaussian process has the form $\operatorname{cov}(s,t) = u(s)v(t)$, $s \le t$, for $s$ and $t$ in some interval and if the ratio $a(t) = u(t)/v(t)$ is continuous and monotone increasing with inverse function $a^{-1}(t)$. If these conditions are met, then the process $Z(a^{-1}(t))/v(a^{-1}(t))$ is a standard Wiener process. For Efron’s transformation, let

$$u(s) = [1 - F_T(s)] \int_0^s \frac{dF_T(u)}{[1 - F_C(u)][1 - F_T(u)]^2}$$

and $v(t) = 1 - F_T(t)$. Thus,

$$a(t) = \frac{u(t)}{v(t)} = \int_0^t \frac{dF_T(u)}{[1 - F_C(u)][1 - F_T(u)]^2}.$$

Csörgő and Horváth [31] look more closely at the Efron transform within the Koziol-Green (proportional hazards) model of random censorship for a simple hypothesis. Under this model, they show that $a^{-1}(t)$ can be estimated by

$$a_n^{-1}(t) = F_0^{-1}\left(1 - (1 + t/p)^{-p}\right)$$

where $F_0^{-1}$ is the inverse of the CDF specified by the null hypothesis and $p$ is the observed proportion of failures in the sample. Csörgő and Horváth’s Efron-transformed process [31] is given by

$$Y_n(t) = \frac{Z[a_n^{-1}(t)]}{1 - F_0[a_n^{-1}(t)]}$$

and the corresponding Cramér-von Mises statistic for lifetime distributions bounded on the left by zero, as many lifetime distributions are, may be written as

$$\omega_n^2(0,T) = \int_0^T [Y_n(t)]^2\,dt$$

where the upper limit $T$ is fixed through a condition on $a_n^{-1}(T)$. Now, as a result of the Efron transform, $Y_n(t)$ is asymptotically a Brownian motion process, and by the scale transformation of Brownian motion [31]

$$\omega_n^2(0,T) = T^2\,\omega_n^2(0,1)$$

in distribution. This scale transformation identifies the process as the integral of the square of a Brownian Bridge process, which was derived by Cameron and Martin in 1944 [31] and is tabled in Csörgő and Horváth [31]. The Brownian Bridge process is Brownian motion on the interval (0,1).
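For a concrete feel for the time transformation, consider the special case (an assumption made here only for illustration) of an exponential lifetime with mean theta under the Koziol-Green model with censoring parameter b. The integral defining a(t) then has the closed form p[exp(t/(p theta)) - 1] with p = 1/(b+1), and its inverse is p theta ln(1 + s/p). The sketch below, whose function names are hypothetical, checks the closed form against a midpoint-rule evaluation of the integral and confirms that the inverse undoes the transformation.

```python
import math


def a_closed(t, theta, b):
    """Closed form of a(t) for exponential lifetimes under the Koziol-Green model."""
    p = 1.0 / (b + 1.0)
    return p * (math.exp(t / (p * theta)) - 1.0)


def a_inverse(s, theta, b):
    """Inverse of a_closed with respect to its first argument."""
    p = 1.0 / (b + 1.0)
    return p * theta * math.log(1.0 + s / p)


def a_numeric(t, theta, b, steps=20000):
    """Midpoint-rule value of the integral of f_T / ([1 - F_C][1 - F_T]^2) over (0, t)."""
    h = t / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        surv_t = math.exp(-u / theta)  # 1 - F_T(u)
        surv_c = surv_t ** b           # 1 - F_C(u) under proportional hazards
        dens_t = surv_t / theta        # f_T(u)
        total += dens_t / (surv_c * surv_t ** 2) * h
    return total


theta, b, t = 2.0, 1.0, 1.5
print(a_closed(t, theta, b), a_numeric(t, theta, b))
print(a_inverse(a_closed(t, theta, b), theta, b))  # recovers t up to rounding
```

Note that a(0) = 0 in this fully specified case, so the transformed clock starts at zero when no parameters are estimated.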

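So far we have only discussed the asymptotic distribution of test statistics for simple hypotheses. In 1986, Habib and Thomas [57] derived the limiting distribution of the stochastic process

$$Z_n(t) = n^{1/2}[F_n(t) - F_T(t;\hat{\theta}_n)]$$

where $\hat{\theta}_n$ is a vector of $k$ parameters estimated by maximum likelihood, for use in goodness-of-fit testing with composite hypotheses. They [57] determined the limiting distribution to be Gaussian with zero mean and covariance given by

$$\operatorname{cov}[Z_n(s), Z_n(t)] = \operatorname{cov}[Z(s), Z(t)] - \left[\frac{\partial F_T(s;\theta)}{\partial\theta}\right]^{T} J^{-1} \left[\frac{\partial F_T(t;\theta)}{\partial\theta}\right] \qquad (15)$$

where $\operatorname{cov}[Z(s), Z(t)]$ is the covariance of Equation 14, the superscript $T$ represents transpose, and $J$ denotes the information matrix whose elements are given by

$$J_{ij} = -\int \frac{\partial^2 \ln f_T(t;\theta)}{\partial\theta_i\,\partial\theta_j}\,[1 - F_C(t)]\,f_T(t;\theta)\,dt - \int \frac{\partial^2 \ln[1 - F_T(t;\theta)]}{\partial\theta_i\,\partial\theta_j}\,[1 - F_T(t;\theta)]\,f_C(t)\,dt \qquad (16)$$

for $i,j = 1,\ldots,k$.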

3.2.2 Problems with the Asymptotic Distribution of Test Statistics in Tests of Exponentiality with Exponential Censoring. For the composite test of hypothesis, consider an exponential lifetime distribution with mean $\theta$ subject to an exponential censoring distribution, which fits the proportional hazards criterion of the Koziol-Green model of random censorship. Let us also use the notation of Koziol and Green [81] to denote the censoring parameter $b$, which is related to the proportion of uncensored observations, $p$, by the relation $b = (1 - p)/p$ and is easily estimated from the data. The asymptotic distribution of $Z_n(t)$ should be Gaussian with zero mean and covariance




function 


cov[Z{s),Z{i)] 


£  (6^  ■ 

e 


1 +  (6+1)2 
(&+1) 


exp 


for $0 < s < t$ and $b > 0$, constructed using the methods given by Habib and Thomas.

Observe  that  the  covariance  function  of  the  resulting  Gaussian  process  can  be  factored  into 
an  expression  in  terms  of  t  multiplied  by  an  expression  depending  only  on  s.  That  is,  let 


v{t)  = 


and 


tt(s) 


1  r  s .  (&+1)  1 


l  +  (6  +  l)2 
(6  +  1) 


exp 


so $\operatorname{cov}(Z(s), Z(t)) = u(s)v(t)$. Now, for the Doob/Efron transformation to Brownian motion, we
need


a(t) 


u{t) 

v{t) 

1 

ib+l) 

1 

(&+1) 


t 


1 +  (6+1)2 
(6  +  1) 

-l-(6  +  l)2|. 


Further noting that the proportion of uncensored observations is $p = 1/(b+1)$, we have


o>n{t)  =  pexp 


p6 


(17) 


and  thus, 

a-\t)=pein(j  +  l  +  ^'^ 


(18) 




where  p  =  When  following  this  methodology  to  construct  a  goodness-of-fit  test  statistic  whose 
asymptotic  distribution  is  known,  namely  the  integral  of  a  squared  Brownian  Bridge  in  this  case, 
the estimation of parameters interferes with the transformation. One problem is that $a_n^{-1}(0) > 0$.

In  other  words,  the  transformation  does  not  return  time  t  to  0.  This  fact,  in  and  of  itself,  is  not  a 
problem.  However,  a  severe  problem  is  that  it  is  possible  to  get  which  means  that 

= 0 for $t > 0$ and so the Csörgő and Horváth test statistic $\omega_n^2(0,T)$ depends exclusively
on the sample size. In other words, the test statistic does not contain enough information to
acknowledge or rule out a hypothesized distribution and, furthermore, would be a constant for each
given sample size regardless of the underlying or hypothesized distributions. In some instances it is
also  possible  to  get  >  T,  which  is  the  upper  limit  of  integration  in  the  integral.  As  a  result, 

there  is  no  guarantee  that  the  general  Gaussian  stochastic  process  Zn{t)  can  be  transformed  into 
the  Brownian  motion  process. 

3.2.3 Problems with the Asymptotic Distribution of Test Statistics when Testing for the Weibull Distribution Within the Proportional Hazards Model of Random Censorship. Consider a goodness-of-fit test with a composite hypothesis in which the lifetime distribution is a 3-parameter Weibull distribution with known shape parameter. Further assume that the censoring model is the proportional hazards model of random censorship. That is, we have a lifetime distribution with distribution function

$$F_T(t;\gamma,\eta) = 1 - \exp\left[-\left(\frac{t-\gamma}{\eta}\right)^{\beta}\right]$$

and a censoring distribution with distribution function

$$F_C(t;\gamma,\eta) = 1 - \exp\left[-b\left(\frac{t-\gamma}{\eta}\right)^{\beta}\right]$$

where $t > \gamma$, $\eta > 0$, $\beta > 0$, $b > 0$ is the censoring parameter [81], and we assume that $\beta$ is known. Applying Equations 14, 15, and 16 yields a covariance function for the zero mean Gaussian process $Z(t) = n^{1/2}[F_n(t) - F_T(t;\hat{\theta}_n)]$ given by


cov{Z{s),  z{t)) 


(s  -  -  7)'^~^e~(  V)^-(V)'^ 


r(i  -  f ) 

M-i 


r2(i-i) 


+ 


/3(6+ir^r(i-i) 


.20-1 


(0-1) 


(s  +  t-2y)  + 


(&+i)r(i 

„2/? 


■1^(5 -7)(<- 


7) 


when using the methods of Habib and Thomas. A regularity condition imposed by this function is that the shape parameter $\beta$ must be greater than 2. Another result that surfaces is that, although this model has a censoring distribution with a hazard function that is proportional to that of the failure distribution, the covariance function takes on a form that cannot be factored as $\operatorname{cov}(Z(s), Z(t)) = u(s)v(t)$, thus preventing the transformation to Brownian motion via the Doob/Efron transform.


3.2.4 Remarks on Asymptotic Distributions of Goodness-of-Fit Statistics. When data are
randomly censored, distribution function estimators, parameter estimators, and goodness-of-fit test
statistics converge more slowly. For example, while the EDF for an uncensored sample converges at a rate
on  the  order  of  ni (logn)~^,  the  KME  converges  at  a  rate  on  the  order  of  ni (logn)"2  [48]  and  the 
kernel  estimator  given  by  Foldes,  Rejto,  and  Winter  converges  at  a  rate  on  the  order  of  (logn)“i 
[49].  In  the  construction  of  confidence  bands  for  the  Kaplan-Meier  estimator  it  is  stated  [16,31] 
that  a  sample  size  of  at  least  n  =  81  x  10^  is  required  for  89%  confidence  using  the  methods  of 
Gillespie and Fisher [55]. Csörgő and Horváth, however, were able to develop a methodology for
constructing such confidence bands which they say requires samples of size n = 35,000, which is
still quite large [31].

It  also  seems  intuitive  that  the  greater  the  amount  of  censoring,  the  slower  the  convergence. 
The  following  passage  from  Silverman  [120:  pages  73-74]  was  in  reference  to  the  research  of  Bickel 
and  Rosenblatt  [10]  on  the  asymptotic  behavior  of  kernel  density  estimators  for  uncensored  data 
but  also  has  relevance  here. 

This discussion is not intended to belittle their remarkable mathematical achievement. The authors themselves point out that ‘these asymptotic calculations are to be taken with a grain of salt’ and Rosenblatt warns in his survey (1971, p. 1818) against the over-literal interpretation of asymptotic results. Nevertheless, he goes on to say, asymptotic theorems are useful if treated with care. For example, they can be used as a starting-point for simulation studies or simulation-based procedures, and they may help to give an intuitive feel for the way that a method will behave in practice.

Currently  there  are  goodness-of-fit  tests  for  randomly  censored  data  for  which  the  asymptotic 
behavior  of  percentage  points  is  established.  However,  no  tables  of  percentage  points  have  been 
published  for  small  samples.  Since  such  tables  can  only  be  constructed  through  Monte  Carlo 
simulation,  this  is  one  area  where  the  applied  statistician  can  make  a  difference. 

3.3  Some  Justification  for  the  Assumption  of  an  Exponentially  Distributed  Censoring  Variable 

Because of complex dependencies of the distributions of goodness-of-fit statistics on the censoring distribution, it is necessary to assume some kind of model for censoring. For distributions which are bounded on the left, consider a censoring distribution that is negative exponential with the same location parameter as the lifetime distribution. Many distributions commonly used in reliability theory are bounded on the left, such as the Weibull, gamma, and lognormal. Some justification for this assumption may be afforded by a 1960 paper by R. F. Drenick [37], who, generalizing the work of Palm [110], provides proof that systems composed of many components in a series arrangement tend to have exponentially distributed lifetimes under reasonably general conditions as the complexity of the system and the operation time increase. In order for Drenick’s Theorem to have some applicability in this random censoring model, we must assume that all modes of censoring are independent, that any means of censoring will remove a subject or item from the test, and that each mode of censoring always exists. With these assumptions, the censoring mechanisms for a randomly censored sample may be viewed as a complex series system, since there may be several modes of censorship for each subject or item on test, any one of which will result in withdrawal from the test. Hence, the assumption of an exponentially distributed censoring variable may be reasonable and somewhat robust.
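The limiting behavior invoked here can be illustrated numerically. In the sketch below, which is only an illustration and not a statement or proof of Drenick’s Theorem, each item is exposed to k independent censoring modes whose censoring times are uniform on (0, k), and the observed censoring time is the minimum over the modes. For large k this minimum has mean and standard deviation both close to 1, as an exponential variable with unit mean would. The uniform distribution, the value of k, and the function name are assumptions chosen for simplicity.

```python
import random
import statistics


def first_censoring_time(k, rng):
    """Time of the first of k independent censoring modes, each uniform on (0, k)."""
    return min(rng.uniform(0.0, k) for _ in range(k))


rng = random.Random(37)
k = 50
times = [first_censoring_time(k, rng) for _ in range(20000)]

# For an exponential variable with unit mean, the mean and the standard
# deviation are both 1 and the survivor function at 1 is exp(-1).
print(statistics.mean(times))
print(statistics.stdev(times))
print(sum(1 for t in times if t > 1.0) / len(times))
```

Replacing the uniform law with another distribution that has positive density at the origin and rescaling accordingly should give a similar picture, although that is not verified here.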




3.4 Modified EDF Statistics for New Goodness-of-Fit Tests for Randomly Censored Data

Goodness-of-fit statistics based on the empirical distribution function, or EDF statistics, are known to be generally superior to other classes of goodness-of-fit tests in terms of power when it comes to detecting departures from a variety of continuous univariate distributions [32,123]. The most successful of these is the Anderson-Darling statistic, followed closely by the Cramér-von Mises statistic. It seems reasonable to assume that they may enjoy the same status among goodness-of-fit test statistics for randomly censored samples.

3.4.1 Computing Formulas. For complete samples under a simple hypothesis, the modified Cramér-von Mises and Anderson-Darling statistics were defined by Equations 1 and 2, respectively, in Chapter II. In the case of the composite hypothesis, $F_0(x)$ may be replaced by its maximum likelihood estimate. Further, in Chapter II it is shown that these statistics may be modified by using the Kaplan-Meier estimator in place of the EDF and employed as the measure that is minimized for distance estimation when samples are randomly right-censored. Although these test statistics work well as distance measures within the context of distance estimation, a problem arises when they are used for goodness-of-fit testing.

Figure 2 Illustration of Upper Integration Limit for the KME-Modified Cramér-von Mises Statistic. (Plot title: Exponential Distribution with 90% Random (Exponential) Censoring.)

The problem, which is prevalent primarily when data are subject to censoring levels above 50%, stems from the fact that estimators of the EDF do a very poor job after the last observation in the sample. For highly censored data, this will often yield a higher test statistic for the correct distribution than for some alternative distribution in a goodness-of-fit test. Figure 2 is helpful in seeing how the KME, for example, is fairly accurate for a correctly specified null distribution up to the last observation, which is circled in the figure, but then is quite far off because it takes on the value 1 beyond $X_{(n)}$. Now, when the null distribution is incorrect the KME, represented by the dash-dot line in Figure 2, is not as good from 0 to $X_{(n)}$ but does well enough from $X_{(n)}$ to $+\infty$ to more than make up for the discrepancies that are measured from 0 to $X_{(n)}$, resulting in a lower test statistic value than what is computed when the null distribution is correct. The effect of this poor ability to estimate beyond $X_{(n)}$ is that for highly censored samples it would be more likely to reject a true null hypothesis than a false one. This problem is corrected simply by replacing the integration limit of $+\infty$ with $X_{(n)}$ when computing the modified goodness-of-fit statistics. Upon making this slight change, the resulting computing formulas for the KME-modified Cramér-von Mises and Anderson-Darling test statistics are


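$$W^2 = n \sum_{j=1}^{n} \left\{ F_n^2(X_{(j-1)})\,(U_{(j)} - U_{(j-1)}) - F_n(X_{(j-1)})\,(U_{(j)}^2 - U_{(j-1)}^2) + \tfrac{1}{3}\,(U_{(j)}^3 - U_{(j-1)}^3) \right\} \qquad (19)$$

and

$$A^2 = n \sum_{j=1}^{n} \left\{ F_n^2(X_{(j-1)})\,[\log U_{(j)} - \log U_{(j-1)}] - [F_n(X_{(j-1)}) - 1]^2\,[\log(1 - U_{(j)}) - \log(1 - U_{(j-1)})] - (U_{(j)} - U_{(j-1)}) \right\} \qquad (20)$$

respectively, where $U_{(j)}$ is the null distribution function evaluated at $X_{(j)}$ and $F_n(X_{(0)}) = U_{(0)} = 0$.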




3.4.2 A Condition for Location and Scale Invariance. The maximum likelihood estimators
for randomly censored data are location and scale invariant provided the censoring distribution has
the same location parameter as the lifetime distribution and makes the same location and scale
changes as the lifetime distribution. The KME is always location invariant when the censoring
distribution makes the same location and scale changes as the lifetime distribution but is only
scale invariant under these transformations when the location parameters are equal. This can be
seen empirically by using the same set of uniform random numbers to generate randomly censored
samples from distribution pairs with location and/or scale changes. To get an identical
product-limit estimate it is necessary to have exactly the same series of zeros and ones for the covariate
indicator function $\delta_i$, which only occurs when the location parameters are equal and the location
and the scale of the censoring distribution undergo the same transformations as the underlying
failure distribution.

Some analytical underpinnings for this are shown in the following argument. Let $u_j$, $j = 1, \ldots, 2n$, be a sequence of random numbers from the uniform distribution on [0,1]. Then random deviates from failure, $F_T$, and censoring, $F_C$, distributions with location and scale parameters $\gamma_k$ and $\theta_k > 0$, $k = 1, 2$, respectively, may be generated using probability integral transformations of the form

$$t_i = \theta_1 F_T^{-1}(u_j) + \gamma_1, \quad i = j = 1, \ldots, n$$

$$c_i = \theta_2 F_C^{-1}(u_j) + \gamma_2, \quad i = 1, \ldots, n, \; j = n+1, \ldots, 2n.$$

The observed data pair then consists of $x_i = \min\{t_i, c_i\}$ and $\delta_i = I_{[t_i < c_i]}$ in the competing risks model of random censorship. Now, suppose we let $s > 0$ denote a change in scale and $h$ denote a location shift. This would change the probability integral transformations in the following way:

$$t_i' = s\theta_1 F_T^{-1}(u_j) + \gamma_1 + h, \quad i = j = 1, \ldots, n$$

$$c_i' = s\theta_2 F_C^{-1}(u_j) + \gamma_2 + h, \quad i = 1, \ldots, n, \; j = n+1, \ldots, 2n.$$

Suppose $t_i < c_i$, which implies that $\delta_i = 1$. Then we have

$$\theta_1 F_T^{-1}(u_j) + \gamma_1 < \theta_2 F_C^{-1}(u_j) + \gamma_2.$$

Now if $\delta_i$ is to remain at 1 after the location and scale shift we need $t_i' < c_i'$, or

$$s\theta_1 F_T^{-1}(u_j) + \gamma_1 + h < s\theta_2 F_C^{-1}(u_j) + \gamma_2 + h.$$

Furthermore, it can be seen from this that if $\gamma_1 = \gamma_2$ then we can be assured that

$$s\theta_1 F_T^{-1}(u_j) < s\theta_2 F_C^{-1}(u_j)$$

since $s > 0$. However, if $\gamma_1 \neq \gamma_2$, then there is no guarantee that $t_i' < c_i'$, and many counterexamples can easily be constructed.
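The empirical check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both distributions are taken to be exponential, and the function names, seed, and parameter values are hypothetical choices, not part of the original study. The same uniform numbers are reused before and after the transformation, and the last two lines build one counterexample with unequal location parameters.

```python
import math
import random


def std_exp_inverse(u):
    """Inverse CDF of the standard exponential (location 0, scale 1)."""
    return -math.log(1.0 - u)


def indicators(u_fail, u_cens, theta1, gamma1, theta2, gamma2, s=1.0, h=0.0):
    """Censoring indicators (1 if t_i < c_i) after a scale change s and shift h."""
    out = []
    for uf, uc in zip(u_fail, u_cens):
        t = s * theta1 * std_exp_inverse(uf) + gamma1 + h
        c = s * theta2 * std_exp_inverse(uc) + gamma2 + h
        out.append(1 if t < c else 0)
    return out


rng = random.Random(1999)  # arbitrary seed, for repeatability only
u_fail = [rng.random() for _ in range(50)]
u_cens = [rng.random() for _ in range(50)]

# Equal location parameters: the indicator sequence is unchanged.
same = indicators(u_fail, u_cens, 2.0, 1.0, 3.0, 1.0)
same_moved = indicators(u_fail, u_cens, 2.0, 1.0, 3.0, 1.0, s=4.0, h=10.0)

# Unequal locations (0 and 1): a hand-picked pair whose indicator flips.
uf, uc = 1.0 - math.exp(-1.5), 1.0 - math.exp(-1.0)
before = indicators([uf], [uc], 1.0, 0.0, 1.0, 1.0)         # t = 1.5, c = 2.0
after = indicators([uf], [uc], 1.0, 0.0, 1.0, 1.0, s=4.0)   # t = 6.0, c = 5.0
```

With equal locations the two indicator lists agree, so the product-limit estimate is unchanged; in the counterexample the observation switches from a failure to a withdrawal when the scale is multiplied by 4.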

Since the modified Cramer-von Mises and Anderson-Darling goodness-of-fit statistics given in
this dissertation are based on maximum likelihood and Kaplan-Meier estimates, which are location
and scale invariant under the condition that the location parameters are equal, it follows that
they are location and scale invariant under this condition as well. It should also be noted that the
minimum distance estimation of location parameters is location invariant.

3.5 New Goodness-of-Fit Tests for Exponential Lifetimes with Exponentially Distributed Random Right Censoring for a Composite Hypothesis

3.5.1 New Tests Based on KME-Modified Cramer-von Mises and Anderson-Darling Test
Statistics. Monte Carlo samples of size 250,000 were used to find percentage points for sample
sizes 20(20)200 and proportions of censoring 0.10(0.10)0.90 for the two new goodness-of-fit tests
for exponential lifetimes subject to exponentially distributed random right-censoring based on the
modified Cramer-von Mises and Anderson-Darling statistics $W^2$ and $A^2$. The statistics are
modified with the natural choice of the Kaplan-Meier estimator replacing the EDF. These tests are
based on the competing risks model of random censorship and the assumption that the failure and
censoring distributions are independent and both exponentially distributed. The null hypothesis is
$H_0$: $F_T(t)$ is exponential. The procedure for testing this hypothesis is as follows:

1. Estimate $\lambda$ using the maximum likelihood estimator $\hat{\lambda} = \frac{1}{r}\sum_{i=1}^{n} x_i$, where $r = \sum_{i=1}^{n} \delta_i$.

2. Construct the KME as defined in Section 2.2.1.

3. Calculate the test statistic, either $W^2$ or $A^2$, using Equation 19 or 20.

4. Estimate the proportion of censoring as $\hat{q} = 1 - r/n$.

5. Enter the appropriate table of percentage points for the nearest $n$ and $q$ at the desired $\alpha$ level
and compare the test statistic to the corresponding percentage point.

6. Reject $H_0$ if the test statistic is greater than the percentage point from the table; otherwise
do not reject the hypothesis of exponentiality.

Tables  of  percentage  points  are  given  in  Appendix  D. 
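Steps 1 through 4 can be sketched in Python as follows. This is a minimal illustration, not the code used in the study. It assumes the usual product-limit form of the KME with no tied observations, strictly positive lifetimes, at least one observed failure, and an exponential null parameterized by its mean. The per-interval terms are the closed-form integrals of $[F_n - u]^2$ and $[F_n - u]^2 / [u(1-u)]$ over each step of the KME, with the upper integration limit at the largest observation. All function names are hypothetical.

```python
import math


def km_cdf_at_order_stats(deltas_sorted):
    """Return [F_n(X_(0)), ..., F_n(X_(n))] for the product-limit estimator.

    Assumes no ties; delta = 1 marks a failure and delta = 0 a withdrawal.
    """
    n = len(deltas_sorted)
    surv = 1.0
    cdf = [0.0]  # F_n(X_(0)) = 0 by convention
    for i, d in enumerate(deltas_sorted, start=1):
        if d == 1:
            surv *= (n - i) / (n - i + 1)
        cdf.append(1.0 - surv)
    return cdf


def km_modified_statistics(x, delta):
    """Sketch of the KME-modified W^2 and A^2 for an exponential null."""
    data = sorted(zip(x, delta))
    n = len(data)
    r = sum(d for _, d in data)                       # number of failures
    lam = sum(xi for xi, _ in data) / r               # step 1: MLE of the mean
    cdf = km_cdf_at_order_stats([d for _, d in data])  # step 2: KME
    u = [0.0] + [1.0 - math.exp(-xi / lam) for xi, _ in data]  # fitted null CDF
    w2 = a2 = 0.0
    for j in range(1, n + 1):                         # step 3: sum over KME steps
        c, u0, u1 = cdf[j - 1], u[j - 1], u[j]
        w2 += c * c * (u1 - u0) - c * (u1**2 - u0**2) + (u1**3 - u0**3) / 3.0
        # The log(u) term is weighted by c^2, so it vanishes while the KME is 0.
        log_u = c * c * (math.log(u1) - math.log(u0)) if c > 0.0 else 0.0
        log_1mu = math.log(1.0 - u1) - math.log(1.0 - u0)
        a2 += log_u - (c - 1.0) ** 2 * log_1mu - (u1 - u0)
    return {"lambda_hat": lam, "q_hat": 1.0 - r / n, "W2": n * w2, "A2": n * a2}
```

The returned `q_hat` and the sample size would then be used to enter the tables of percentage points (steps 5 and 6), which are not reproduced in this sketch.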




3.5.2 Burke Test for Exponentiality for a Composite Hypothesis. For Burke's test of
exponentiality, we assume that the failure and censoring random variables are independent and
exponentially distributed with $F_T(t) = 1 - e^{-t/\lambda}$ and $F_C(c) = 1 - e^{-c/\theta}$. Though Burke does not
explicitly  give  his  test  statistic  a  name,  we  will  call  it  B^.  It  can  be  expressed  as 


(21) 


where  M„  =  xelogn  for  some  e  >  0,  x  =  ^  X)"=i  ^  ^  I  ^"=1  and  0  =  ^  E"=i  ^i- 

Simpson’s  rule  with  101  function  evaluations  was  the  numerical  integration  technique  used  and  e  =  1 
was  selected  for  the  construction  of  the  table  of  small  sample  percentage  points.  See  Appendix  A 
for  a  description  of  Simpson’s  rule.  Monte  Carlo  samples  of  size  250,000  were  used  to  construct 
Table  12. 
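For readers without Appendix A at hand, composite Simpson's rule with 101 function evaluations (100 subintervals) can be sketched as below. The function name and signature are illustrative assumptions; the integrand of Equation 21 is not reproduced here, so any callable `f` stands in for it.

```python
import math


def simpson(f, a, b, evaluations=101):
    """Composite Simpson's rule using an odd number of function evaluations."""
    if evaluations < 3 or evaluations % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of points, at least 3")
    m = evaluations - 1          # number of subintervals (even)
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        # Interior points alternate between weights 4 (odd k) and 2 (even k).
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


# Example: the integral of exp(t) over [0, 1] is e - 1.
approx = simpson(math.exp, 0.0, 1.0)
```

Simpson's rule is exact for polynomials up to degree three, which makes cubic integrands a convenient sanity check for an implementation.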

The null hypothesis is $H_0$: $F_T(t)$ is exponential. The procedure for testing this hypothesis is
as follows:


1. Estimate $\lambda$ using the maximum likelihood estimator $\hat{\lambda} = \frac{1}{r}\sum_{i=1}^{n} x_i$.

2. Estimate $\theta$ using the estimator $\hat{\theta} = \frac{1}{n-r}\sum_{i=1}^{n} x_i$.

3.  Construct  the  KME  as  defined  in  Section  2.2.1. 

4.  Calculate  the  test  statistic  B'^. 

5. Estimate the proportion of censoring as $\hat{q} = 1 - r/n$.

6.  Enter  the  appropriate  table  of  percentage  points  for  the  nearest  n  and  q  at  the  desired  a  level 
and  compare  the  test  statistic  to  the  corresponding  percentage  point. 

7.  Reject  Ho  if  the  test  statistic  is  greater  than  the  percentage  point  from  the  table,  otherwise 
do  not  reject  the  hypothesis  of  exponentiality. 




Table 12  Percentage Points of Burke's Statistic for the Exponential Distribution with Exponential Censoring.

                                   α
    n      0.25    0.20    0.15    0.10    0.05    0.025

q = .20
   20     0.182   0.205   0.236   0.279   0.360   0.446
   40     0.160   0.180   0.206   0.245   0.314   0.387
   60     0.152   0.171   0.196   0.232   0.297   0.367
   80     0.148   0.166   0.190   0.225   0.289   0.355
  100     0.146   0.164   0.187   0.222   0.284   0.350

q = .50
   20     0.199   0.229   0.270   0.332   0.450   0.578
   40     0.178   0.204   0.238   0.289   0.386   0.491
   60     0.170   0.195   0.227   0.276   0.365   0.460
   80     0.166   0.190   0.221   0.268   0.352   0.444
  100     0.165   0.188   0.218   0.264   0.347   0.437

q = .80
   20     0.184   0.224   0.307   0.430   0.682   1.045
   40     0.164   0.201   0.252   0.356   0.539   0.764
   60     0.161   0.189   0.234   0.308   0.471   0.660
   80     0.160   0.187   0.228   0.294   0.437   0.605
  100     0.161   0.188   0.226   0.286   0.415   0.574
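The table lookup in steps 6 and 7 of the Burke procedure ("nearest $n$ and $q$") can be sketched as follows. The values are transcribed from Table 12; the label of the third block is garbled in the source and is read here as $q = 0.80$, which is an assumption. The dictionary layout and function names are hypothetical, and ties in distance are resolved in favor of the first key encountered.

```python
ALPHAS = [0.25, 0.20, 0.15, 0.10, 0.05, 0.025]

# Percentage points keyed by proportion of censoring q, then sample size n.
BURKE_TABLE_12 = {
    0.20: {
        20: [0.182, 0.205, 0.236, 0.279, 0.360, 0.446],
        40: [0.160, 0.180, 0.206, 0.245, 0.314, 0.387],
        60: [0.152, 0.171, 0.196, 0.232, 0.297, 0.367],
        80: [0.148, 0.166, 0.190, 0.225, 0.289, 0.355],
        100: [0.146, 0.164, 0.187, 0.222, 0.284, 0.350],
    },
    0.50: {
        20: [0.199, 0.229, 0.270, 0.332, 0.450, 0.578],
        40: [0.178, 0.204, 0.238, 0.289, 0.386, 0.491],
        60: [0.170, 0.195, 0.227, 0.276, 0.365, 0.460],
        80: [0.166, 0.190, 0.221, 0.268, 0.352, 0.444],
        100: [0.165, 0.188, 0.218, 0.264, 0.347, 0.437],
    },
    0.80: {  # block label assumed; it is unreadable in the source
        20: [0.184, 0.224, 0.307, 0.430, 0.682, 1.045],
        40: [0.164, 0.201, 0.252, 0.356, 0.539, 0.764],
        60: [0.161, 0.189, 0.234, 0.308, 0.471, 0.660],
        80: [0.160, 0.187, 0.228, 0.294, 0.437, 0.605],
        100: [0.161, 0.188, 0.226, 0.286, 0.415, 0.574],
    },
}


def critical_value(n, q, alpha):
    """Percentage point for the tabled n and q nearest to the sample values."""
    q_key = min(BURKE_TABLE_12, key=lambda k: abs(k - q))
    n_key = min(BURKE_TABLE_12[q_key], key=lambda k: abs(k - n))
    return BURKE_TABLE_12[q_key][n_key][ALPHAS.index(alpha)]


def reject_h0(statistic, n, q, alpha):
    """One-sided decision rule: reject when the statistic exceeds the table value."""
    return statistic > critical_value(n, q, alpha)
```

For a sample with $n = 211$ and $\hat{q} = 0.573$, the nearest tabled entries are $n = 100$ and $q = 0.50$.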

3.6 New Goodness-of-Fit Tests for the Weibull with Exponentially Distributed Random Right-Censoring for a Composite Hypothesis

3.6.1 New Tests Based on Modified Cramer-von Mises and Anderson-Darling Statistics for
Unknown Location and Scale. Monte Carlo simulation was used to find percentage points for sample
sizes 20(20)100 and proportions of censoring 0.20, 0.50, and 0.80 for the two new goodness-of-fit
tests for Weibull lifetimes with known shape parameter values $\beta = 2$ and $\beta = 3.5$ subject to
exponentially distributed random right-censoring, based on the Cramer-von Mises and Anderson-Darling
statistics where the KME is substituted for the EDF. Monte Carlo samples of size 250,000
were used to find the percentage points, which are tabled in Appendix E. These tests are based on
the competing risks model of random censorship. The failure distribution is assumed to be Weibull,
the censoring distribution is assumed to be exponential, and the two are assumed to be statistically
independent. That is, we have a random sample of pairs $(x_1, \delta_1), \ldots, (x_n, \delta_n)$ where
$x_i = \min\{t_i, c_i\}$ and $\delta_i = I_{[t_i < c_i]}$. The null
hypothesis is $H_0$: $F_T(t)$ is Weibull with known shape parameter $\beta$. The procedure for testing this
hypothesis is as follows:




1. Estimate the Weibull location parameter via minimum distance estimation and the scale parameter using
maximum likelihood estimation as in Section 2.1.3, only with the shape parameter known.

2. Construct the KME as defined in Section 2.2.1.

3. Calculate the test statistic, either $W^2$ or $A^2$.

4. Estimate the proportion of censoring as $\hat{q} = 1 - r/n$.

5. Enter the table of percentage points for the Weibull with shape $\beta$ under the nearest $n$ and $q$
at the desired $\alpha$ level and compare the test statistic to the corresponding percentage point.

6. Reject $H_0$ if the test statistic is greater than the percentage point from the table; otherwise
do not reject the hypothesis that the failure process is Weibull with shape $\beta$.

Tables of percentage points to be used with these tests are given in Appendix E.

3.7  New  Goodness-of-Fit  Tests  Based  on  Crude  Lifetimes 

The  competing  risks  model  of  random  censorship  is  widely  used  because  it  makes  sense  to 
view  the  underlying  failure  process  as  one  risk  and  the  combination  of  every  possible  reason  why 
an  item  may  be  withdrawn  from  a  test  as  the  other  risk.  In  competing  risks  theory,  a  net  lifetime 
represents  the  lifetime  of  an  item  when  it  is  subject  to  one  of  the  specified  risks  while  no  other  risks 
are  present  and  a  crude  lifetime  represents  the  lifetime  of  an  item  subject  to  one  of  the  specified 
risks  when  all  risks  are  present  [87:  p.  110].  Our  purpose  is  to  characterize  the  distribution  of  the 
net  lifetime  for  the  risk  of  interest,  which  is  our  underlying  failure  process. 

Let $T$ represent the net lifetime of the variable of interest and let $C$ characterize the net lifetime of a random variable that censors the observed life of $T$. In this context the observed data for an item consist of the lifetime $X = \min\{T, C\}$ and the covariate indicator $\delta$. By convention, if the item is observed until failure, then $T < C$, $X = T$, and $\delta = 1$. If an item is withdrawn from testing before it fails or fails from any cause other than the cause of interest, then $T > C$, $X = C$, and $\delta = 0$. We define the crude lifetimes as conditional lifetimes which depend on whether the observed life of an item on test represents a failure time or a withdrawal time. That is, let $Y_T$ be the conditional life of $X$ given that $X = T$ and let $Y_C$ be the conditional life of $X$ given that $X = C$. If we assume $T$ and $C$ to be independent, then

$$\begin{aligned} S_X(x) &= P[X > x] \\ &= P[T > x]\,P[C > x] \\ &= S_T(x)\,S_C(x). \end{aligned} \qquad (22)$$

Furthermore, by conditioning,

$$\begin{aligned} S_X(x) &= P[X > x] \\ &= P[X = T]\,P[X > x \mid X = T] + P[X = C]\,P[X > x \mid X = C] \\ &= P[X = T]\,P[Y_T > x] + P[X = C]\,P[Y_C > x] \\ &= p\,S_{Y_T}(x) + (1 - p)\,S_{Y_C}(x) \end{aligned} \qquad (23)$$

where $p = P[X = T] = P[T < C]$. Thus, from Equations 22 and 23 we have the following relationship between the survivor functions of the observed, net, and crude lifetimes:

$$S_X(x) = S_T(x)\,S_C(x) = p\,S_{Y_T}(x) + (1 - p)\,S_{Y_C}(x). \qquad (24)$$

This shows that the observed lifetime $X$ can be modeled both as the minimum of competing net lifetimes and as a mixture of crude lifetimes.
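The two readings of Equation 24 can be illustrated with a small simulation. The sketch below assumes exponential net lifetimes with rates 1.0 and 0.5 purely for illustration; the seed, sample size, and helper names are hypothetical. The mixture side holds exactly for empirical survivor functions, while the product side holds only up to sampling error because it relies on independence.

```python
import random


def empirical_survivor(sample, x):
    """Fraction of the sample strictly greater than x."""
    return sum(1 for v in sample if v > x) / len(sample)


rng = random.Random(22)  # arbitrary seed
n = 5000
t = [rng.expovariate(1.0) for _ in range(n)]   # net failure times (assumed rate 1.0)
c = [rng.expovariate(0.5) for _ in range(n)]   # net censoring times (assumed rate 0.5)

observed = [min(a, b) for a, b in zip(t, c)]
failures = [a for a, b in zip(t, c) if a < b]        # crude failure lifetimes
withdrawals = [b for a, b in zip(t, c) if a >= b]    # crude censoring lifetimes
p_hat = len(failures) / n


def mixture_survivor(x):
    """Right-hand side of Equation 24 with empirical crude survivor functions."""
    return (p_hat * empirical_survivor(failures, x)
            + (1.0 - p_hat) * empirical_survivor(withdrawals, x))


def product_survivor(x):
    """Middle term of Equation 24 with empirical net survivor functions."""
    return empirical_survivor(t, x) * empirical_survivor(c, x)
```

In practice only `observed`, `failures`, and `withdrawals` are available; the net samples `t` and `c` are visible here only because the data are simulated.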

In practice, this means that from the observed random sample of $x_i = \min\{t_i, c_i\}$, $i = 1, \ldots, n$, with indicator $\delta_i = I_{[t_i < c_i]}$, the set of failure times $t_{(1)}, \ldots, t_{(r)}$ consists of the observed failures from a randomly censored sample of net lifetimes with distribution function $F_T(t)$ and can also be taken as a complete sample of crude lifetimes with distribution function $F_{Y_T}(t)$. Furthermore, the set of withdrawal times $c_{(1)}, \ldots, c_{(n-r)}$ is a randomly censored sample of net lifetimes with distribution function $F_C(c)$ and can also be taken as a complete sample of crude lifetimes with distribution function $F_{Y_C}(c)$. The randomly censored sample can now be treated as a mixture of two distributions where the population from which each observation is drawn is known and the mixing parameter $p$ is easily estimated by $\hat{p} = r/n$. Now, since the observed crude lifetimes are considered to be complete samples, we can use complete sample goodness-of-fit procedures to determine the distributions of the crude lifetimes of both the failure and the censoring variables. The incentive to fit a distribution to the crude lifetime of the censoring variable arises through the following relationship between crude lives and net lives. When net lifetimes are independent, the hazard function of the net life of the random variable $T$, our variable of interest, is

$$h_T(x) = \frac{p\,f_{Y_T}(x)}{p\,S_{Y_T}(x) + (1 - p)\,S_{Y_C}(x)}. \qquad (25)$$


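A proof of this result is given in [87: pp. 288-290]. An alternative proof is given here as follows. Since $Y_T$ is the conditional life of $X$ given that $X = T$, and $T$ and $C$ are independent, the density function of $Y_T$ can be expressed as

$$\begin{aligned} f_{Y_T}(x) &= \lim_{\Delta x \to 0} \frac{P[x < Y_T < x + \Delta x]}{\Delta x} \\ &= \lim_{\Delta x \to 0} \frac{P[x < X < x + \Delta x \mid X = T]}{\Delta x} \\ &= \lim_{\Delta x \to 0} \frac{P[x < T < x + \Delta x,\ X = T]}{\Delta x\,P[X = T]} \\ &= \lim_{\Delta x \to 0} \frac{P[x < T < x + \Delta x]\,P[C > x]}{\Delta x\,P[T < C]} \\ &= \frac{f_T(x)\,S_C(x)}{p}. \end{aligned} \qquad (26)$$

Solving Equation 26 for $f_T(x)$ yields

$$f_T(x) = \frac{p\,f_{Y_T}(x)}{S_C(x)},$$

which can then be written as

$$h_T(x)\,S_T(x) = \frac{p\,f_{Y_T}(x)}{S_C(x)}.$$

As a result, we have

$$h_T(x) = \frac{p\,f_{Y_T}(x)}{S_T(x)\,S_C(x)} = \frac{p\,f_{Y_T}(x)}{p\,S_{Y_T}(x) + (1 - p)\,S_{Y_C}(x)}.$$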

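To further examine the intricate relationships between net and crude lifetimes, consider the survivor functions of the crude lifetimes expressed in terms of the net lifetimes. We have

$$\begin{aligned} S_{Y_T}(x) &= P[Y_T > x] \\ &= P[X > x \mid X = T] \\ &= P[T > x \mid T < C] \\ &= \frac{P[T > x,\ T < C]}{P[T < C]} \\ &= \frac{\int_x^\infty \left[\int_t^\infty f_T(t)\,f_C(c)\,dc\right] dt}{p} \\ &= \frac{\int_x^\infty f_T(t)\,S_C(t)\,dt}{p} \end{aligned}$$

and

$$\begin{aligned} S_{Y_C}(x) &= P[Y_C > x] \\ &= P[X > x \mid X = C] \\ &= P[C > x \mid C < T] \\ &= \frac{P[C > x,\ C < T]}{P[C < T]} \\ &= \frac{\int_x^\infty \left[\int_c^\infty f_C(c)\,f_T(t)\,dt\right] dc}{1 - p} \\ &= \frac{\int_x^\infty f_C(c)\,S_T(c)\,dc}{1 - p}. \end{aligned}$$

By substituting the expressions we have obtained for $f_{Y_T}(x)$, $S_{Y_T}(x)$, and $S_{Y_C}(x)$ into Equation 25, we gain insight into the identity

$$\begin{aligned} h_T(x) &= \frac{p\,f_{Y_T}(x)}{p\,S_{Y_T}(x) + (1 - p)\,S_{Y_C}(x)} \\ &= \frac{p\left[\dfrac{f_T(x)\,S_C(x)}{p}\right]}{p\left[\dfrac{\int_x^\infty f_T(t)\,S_C(t)\,dt}{p}\right] + (1 - p)\left[\dfrac{\int_x^\infty f_C(t)\,S_T(t)\,dt}{1 - p}\right]} \\ &= \frac{f_T(x)\,S_C(x)}{\int_x^\infty f_T(t)\,S_C(t)\,dt + \int_x^\infty f_C(t)\,S_T(t)\,dt} \\ &= \frac{h_T(x)\,S_T(x)\,S_C(x)}{\int_x^\infty \left[h_T(t)\,S_T(t)\,S_C(t) + h_C(t)\,S_T(t)\,S_C(t)\right] dt} \\ &= \frac{h_T(x)\,S_T(x)\,S_C(x)}{\int_x^\infty \left[h_T(t) + h_C(t)\right] S_T(t)\,S_C(t)\,dt} \\ &= \frac{h_T(x)\,S_X(x)}{\int_x^\infty h_X(t)\,S_X(t)\,dt} \\ &= \frac{h_T(x)\,S_X(x)}{\int_x^\infty f_X(t)\,dt} \\ &= \frac{h_T(x)\,S_X(x)}{S_X(x)} \\ &= h_T(x). \end{aligned}$$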
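The identity can also be checked numerically. The sketch below assumes, purely for illustration, a Weibull net failure time with shape 2 and scale 1 (so that $h_T(x) = 2x$) and an exponential net censoring time with rate 1. The crude quantities are built from the integral expressions above using a composite Simpson's rule truncated at an upper limit of 12, where the integrands are negligible. All names and parameter values are hypothetical.

```python
import math

SHAPE, SCALE, RATE = 2.0, 1.0, 1.0   # assumed net-lifetime parameters


def simpson(f, a, b, m=4000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def S_T(t):
    return math.exp(-((t / SCALE) ** SHAPE))


def f_T(t):
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1.0) * S_T(t)


def S_C(t):
    return math.exp(-RATE * t)


def f_C(t):
    return RATE * S_C(t)


def net_hazard_from_crude(x, upper=12.0):
    """Right-hand side of Equation 25 built from crude-lifetime quantities."""
    p = simpson(lambda t: f_T(t) * S_C(t), 0.0, upper)           # P[T < C]
    f_yt = f_T(x) * S_C(x) / p                                    # Equation 26
    s_yt = simpson(lambda t: f_T(t) * S_C(t), x, upper) / p
    s_yc = simpson(lambda t: f_C(t) * S_T(t), x, upper) / (1.0 - p)
    return p * f_yt / (p * s_yt + (1.0 - p) * s_yc)


def net_hazard_direct(x):
    """Weibull hazard of the net lifetime T, for comparison."""
    return (SHAPE / SCALE) * (x / SCALE) ** (SHAPE - 1.0)
```

Comparing `net_hazard_from_crude(x)` with `net_hazard_direct(x)` at a few values of `x` gives agreement to within the accuracy of the numerical integration.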


The goodness-of-fit test procedure can be posed as a simultaneous hypothesis test on the crude lifetimes of the failure and censoring variables. The null hypothesis is

$$H_0: \; F_{Y_T} \in \{H_\phi,\ \phi = (\phi_1, \ldots, \phi_k)' \in \Phi\}, \qquad F_{Y_C} \in \{G_\psi,\ \psi = (\psi_1, \ldots, \psi_m)' \in \Psi\}$$

where $H$ and $G$ are families of distributions and $\phi$ and $\psi$ are vectors of parameters in parameter spaces $\Phi$ and $\Psi$. The procedure for testing this hypothesis is as follows:

1. Estimate $p$ by $\hat{p} = r/n$.

2. (a) Perform any complete sample goodness-of-fit test on the failure set.

(b) Perform any complete sample goodness-of-fit test on the withdrawal set.

3. Use significance level $\alpha/2$ for each test to achieve an overall level of significance of at most $\alpha$.

4. Reject $H_0$ if either test rejects its null hypothesis; otherwise do not reject $H_0$.

5. If $H_0$ is not rejected, find the hazard function and, hence, the distribution of the net lifetime of interest by substituting $\hat{p}$ for $p$ and using the maximum likelihood estimates of $f_{Y_T}(t)$, $S_{Y_T}(t)$, and $S_{Y_C}(t)$ in Equation 25.

D'Agostino and Stephens [32] is an excellent reference for goodness-of-fit testing. It contains procedures for many well known distributions and is complete with tables of percentage points and computing formulas as well as recommendations for which tests are generally more powerful for given distributions.

Many models of the net lifetime of interest can be constructed using combinations of popular lifetime distributions such as the Weibull, gamma, lognormal, inverse Gaussian, extreme value, and many others to characterize crude lifetimes. For example, if both of the crude lifetimes are hypothesized to come from independent 2-parameter Weibull distributions with distribution functions $F_{Y_T}(t) = 1 - \exp[-(t/\eta)^\beta]$, $\eta, \beta > 0$, and $F_{Y_C}(t) = 1 - \exp[-(t/\alpha)^\kappa]$, $\alpha, \kappa > 0$, and the null hypothesis is not rejected, then the hazard function of the net lifetime of interest would have the form

$$h_T(t) = \frac{p\,\frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1} \exp[-(t/\eta)^\beta]}{p\,\exp[-(t/\eta)^\beta] + (1 - p)\,\exp[-(t/\alpha)^\kappa]}.$$

The relationships between the hazard function and the density, survivor, and distribution functions are well known. Once the hazard function is determined, the density function is given by

$$f_T(t) = h_T(t)\,\exp\left[-\int_0^t h_T(u)\,du\right]$$

and the survivor function of the distribution is

$$S_T(t) = \exp\left[-\int_0^t h_T(u)\,du\right]$$

and, of course, the distribution function is $F_T(t) = 1 - S_T(t)$. See Appendix B for details of an interesting phenomenon in which the net lifetime of interest, $T$, follows a split population model [118]. This phenomenon occurs when the cumulative hazard function of the crude lifetime of the censoring variable is less than the cumulative hazard function of the crude lifetime of the variable of interest, that is, when $H_{Y_C}(t) < H_{Y_T}(t)$.

Although  this  resulting  functional  form  of  the  distribution  may  not  be  as  easy  to  manipulate 
as  other  distributional  forms,  a  wide  variety  of  distribution  shapes  can  be  modeled  and  many  types 
of  reliability  analyses  and  maintenance  planning  can  still  be  performed.  Consider  the  following 
examples  in  which  the  2-parameter  Weibull  distribution  is  used  to  model  the  crude  lifetimes.  Figure 
3  shows  the  density  and  distribution  functions  of  the  net  lifetime  when  the  crude  failure  distribution 
is  Weibull  with  shape  2.5  and  scale  900  and  the  crude  censoring  distribution  is  Weibull  with  shape 
1.5  and  scale  600.  Figure  4  shows  the  density  and  distribution  function  of  the  net  lifetime  when  the 
crude  failure  distribution  is  gamma  with  shape  3  and  scale  500  and  the  crude  censoring  distribution 
is  lognormal  with  shape  1.5  and  scale  600.  Many  other  distribution  shapes  can  result  by  varying  the 
Weibull  parameters  and  using  different  distributions  to  represent  crude  lifetimes.  The  benefits  of 
this  approach  lie  not  only  in  its  versatility,  but  in  the  fact  that  the  crude  lifetimes  are  simply  taken 
as  complete  samples  and  complete  sample  methods  may  be  used  to  test  the  goodness-of-fit  of  each. 


[Figure 3 panels: Net Life Density Function for Weibull Crude Lifetimes; Net Life Distribution Function for Weibull Crude Lifetimes]

Figure 3 Net Lifetime when Crude Lifetimes are WEI(2.5, 900) and WEI(1.5, 600) at 50% Expected Censoring.

It is important, however, to remember when conducting a simultaneous test of hypothesis that,
as a result of Bonferroni's inequality, an overall level of significance of at most $\alpha$ is maintained by
dividing $\alpha$ by the number of tests, two in this case, so the level of significance used in each test
should be $\alpha/2$.

An application of this test procedure is demonstrated on a randomly right-censored set of leukemia remission times. The data are taken from [87: p. 190]. In the study, the treatment group consisted of a sample of 21 patients who were treated with a drug named 6-mercaptopurine (6-MP). The cancer resumed in 9 of the patients while under observation, whereas 12 were still in remission when they were withdrawn from observation, or censored. The set of observed remission times (measured in weeks) is

6 6 6 7 10 13 16 22 23

and the set of withdrawal times is

6 9 10 11 17 19 20 25 32 32 34 35.

[Figure 4 panels: Net Life Density Function for Gamma and Lognormal Crude Lifetimes; Net Life Distribution Function for Gamma and Lognormal Crude Lifetimes]

Figure 4 Net Lifetime when Crude Lifetimes are GAM(3, 500) and LOGN(1.5, 600) at 50% Expected Censoring.

The set of observed remission times constitutes a complete sample of crude lifetimes for the variable of interest: the remission time of a leukemia patient taking the drug 6-MP. Using the Anderson-Darling test for complete samples, as outlined in [32], a test for the 2-parameter Weibull distribution (transformed to the extreme value distribution) does not reject the composite hypothesis of an underlying Weibull distribution for the crude life of the variable of interest ($A^2(1 + 0.2/\sqrt{n}) = 0.5801$, $0.10 < p < 0.25$). Similarly, the Anderson-Darling test with the composite hypothesis of an underlying Weibull distribution on the complete sample of 12 withdrawal times does not reject the Weibull hypothesis ($A^2(1 + 0.2/\sqrt{n}) = 0.4719$, $p > 0.25$). The maximum likelihood estimates of the parameters of the 2-parameter Weibull distribution for the failure times are $\hat{\beta} = 2.03$ for the shape and $\hat{\eta} = 13.77$ for the scale. Likewise, the estimates of the Weibull parameters for the withdrawal times are $\hat{\kappa} = 2.22$ for the shape and $\hat{\alpha} = 23.60$ for the scale. The expected proportion of censoring is estimated by
$1 - \hat{p} = 12/21 \approx 0.57$. Using Equation 25 and numerical integration by Simpson's rule (see Appendix A), plots of the density and distribution functions for the underlying distribution of the net time of remission were constructed and are shown in Figure 5. Further, the distribution function estimate of the underlying net time of remission is plotted with the Kaplan-Meier estimate in Figure 6. The Kaplan-Meier estimate was constructed from the censored set of data with the necessary adjustments made to accommodate the ties [87]. The estimate appears quite good in spite of such a small sample and relatively high censoring. Interestingly, the resulting distribution of the net lifetime of remission is a split model, the phenomenon discussed in Appendix B. Split models have been applied in recidivism and economics by Schmidt and Witte [118, 119]. This example indicates that the split model may also be appropriate for the leukemia remission times in the study of the drug 6-MP. Statistically speaking, the implication in this example is that the cancer will resume in approximately 58% of the patients who are treated with 6-MP, leaving the remaining 42% leukemia-free for life. Before taking such strong claims too seriously, however, keep in mind the small sample size used in the study.

[Figure 5 panels: Net Life Density Function for Weibull Crude Lifetimes: Leukemia Data; Net Life Distribution Function for Weibull Crude Lifetimes: Leukemia Data; horizontal axis t]

Figure 5 Net Lifetime for the Leukemia Remission Times when Crude Lifetimes are WEI(2.03, 13.77) and WEI(2.22, 23.60) at 57% Censoring.
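The net distribution function for this example can be approximated with a short Python sketch. It assumes the Weibull survivor function has the form $\exp[-(t/\text{scale})^{\text{shape}}]$, takes $\hat{p} = 9/21$ together with the reported estimates, and integrates the hazard of Equation 25 by a composite Simpson's rule. The function names and the number of subintervals are illustrative choices, not those of the original analysis.

```python
import math

P_HAT = 9 / 21                 # proportion of observed failures
BETA, ETA = 2.03, 13.77        # crude failure lifetime: Weibull shape, scale
KAPPA, ALPHA = 2.22, 23.60     # crude censoring lifetime: Weibull shape, scale


def weibull_survivor(t, shape, scale):
    return math.exp(-((t / scale) ** shape))


def weibull_density(t, shape, scale):
    return (shape / scale) * (t / scale) ** (shape - 1.0) * weibull_survivor(t, shape, scale)


def net_hazard(t):
    """Equation 25 with Weibull crude lifetimes and p replaced by its estimate."""
    numerator = P_HAT * weibull_density(t, BETA, ETA)
    denominator = (P_HAT * weibull_survivor(t, BETA, ETA)
                   + (1.0 - P_HAT) * weibull_survivor(t, KAPPA, ALPHA))
    return numerator / denominator


def net_cdf(t, m=600):
    """F_T(t) = 1 - exp(-H_T(t)), with H_T computed by Simpson's rule (m even)."""
    h = t / m
    total = net_hazard(0.0) + net_hazard(t)
    for k in range(1, m):
        total += (4.0 if k % 2 else 2.0) * net_hazard(k * h)
    cumulative_hazard = total * h / 3.0
    return 1.0 - math.exp(-cumulative_hazard)


plateau = net_cdf(60.0)   # value of F_T at the right edge of the plotted range
```

Because the cumulative hazard stays bounded, the estimated distribution function levels off below 1 rather than approaching it, which is the split-population behavior described above.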


3.8 New Semi-Parametric Goodness-of-Fit Tests Based on Crude Lifetimes

Other goodness-of-fit tests can be constructed using the crude failure and censoring times where a parametric fit is obtained only for the crude failure times and the EDF of the crude censoring times is used in Equation 25 to find the distribution of the underlying net failure process. This approach eliminates the need to fit a distribution to the crude censoring variable. Furthermore, aside from independence from $T$, no assumption about the censoring distribution is necessary at all, which distinguishes the following procedure from the rest. The procedure to test the null hypothesis $H_0: F_{Y_T} \in \{H_\phi,\ \phi = (\phi_1, \ldots, \phi_k)' \in \Phi\}$ is simply to perform any complete sample goodness-of-fit test on the set of failure times only and, if $H_0$ is not rejected, the partially parametric hazard function of the underlying net failure process is then given by

$$h_T(t) = \frac{\hat{p}\,f_{Y_T}(t;\ \hat{\phi}_{mle})}{\hat{p}\,S_{Y_T}(t;\ \hat{\phi}_{mle}) + (1 - \hat{p})\,S_{Y_C,n}(t)} \qquad (27)$$

where $S_{Y_C,n}(t)$ is the empirical survivor function of the crude lifetime of the censoring variable and is defined as

$$S_{Y_C,n}(t) = \begin{cases} 1, & t < c_{(1)} \\ 1 - \dfrac{i}{n - r}, & c_{(i)} \le t < c_{(i+1)},\ i = 1, \ldots, n - r - 1 \\ 0, & t \ge c_{(n-r)}. \end{cases}$$

The power of this test is the same as that of any complete sample test for a given distribution with sample size equal to the size of the failure set in the censored sample.
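Equation 27 can be sketched in Python as follows, assuming for illustration that the parametric crude failure distribution is a 2-parameter Weibull with survivor function $\exp[-(t/\text{scale})^{\text{shape}}]$. The withdrawal times, parameter values, and function names in the example are hypothetical and are not data from this study.

```python
import math
from bisect import bisect_right


def empirical_survivor(sample):
    """Return S(t) = proportion of the sample strictly greater than t."""
    ordered = sorted(sample)
    m = len(ordered)

    def survivor(t):
        # bisect_right counts the observations less than or equal to t.
        return 1.0 - bisect_right(ordered, t) / m

    return survivor


def semi_parametric_hazard(t, p_hat, shape, scale, censor_survivor):
    """Equation 27 with a Weibull fit for the crude failure lifetime."""
    s_yt = math.exp(-((t / scale) ** shape))
    f_yt = (shape / scale) * (t / scale) ** (shape - 1.0) * s_yt
    return p_hat * f_yt / (p_hat * s_yt + (1.0 - p_hat) * censor_survivor(t))


# Hypothetical withdrawal times and fitted values, for illustration only.
S_cens = empirical_survivor([2.0, 4.0, 4.0, 7.0])
h_early = semi_parametric_hazard(1.0, 0.5, 2.0, 5.0, S_cens)
h_late = semi_parametric_hazard(10.0, 0.5, 2.0, 5.0, S_cens)
```

Beyond the largest withdrawal time the empirical survivor function is zero, so the expression reduces to the hazard of the fitted crude failure distribution; before the first withdrawal time it is damped by the full weight $1 - \hat{p}$ in the denominator.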


To demonstrate the usefulness of this procedure, it is applied to a randomly right-censored set of survival times of 211 stage IV prostate cancer patients who were treated with estrogen in a Veterans Administration Cooperative Urological Research Group study [138]. These data have been examined by several authors. In the study, 90 patients died of prostate cancer, 105 died due to other causes, and 16 were still alive. Therefore, we have 90 failure times and 121 withdrawal times. The actual data set can be found in [42, 66, 138]. Prior analysis in the form of a simple test of hypothesis assuming an underlying exponential distribution with a mean of 100 weeks has been performed by Koziol and Green, Hollander and Proschan, Ebrahimi and Habibullah, and Csorgo and Horvath. Hollander and Proschan performed Hyde's test on the data as well as their own. Further, Hollander and Proschan performed the Koziol and Green test and arrived at different results than Koziol and Green's original calculations [66]. The results of Hollander and Proschan for the Koziol and Green test are the ones given in Table 13. C. H. Chen performed a composite test of hypothesis for the exponential distribution using his correlation test statistic. For comparison purposes, Burke's test for exponentiality was performed under a composite hypothesis, as were the KME-modified Cramer-von Mises and Anderson-Darling tests presented in this dissertation. The KME-modified Cramer-von Mises and Anderson-Darling tests as well as Burke's, Chen's, and Csorgo & Horvath's tests reject the hypothesis of exponentiality at $\alpha = 0.05$. The results of these tests are summarized in Table 13.

Table 13  Summary of Test Results for the Prostate Cancer Data.

Test                                        Hypothesis          p-value      Comments
Koziol & Green                              Simple, EXP(100)    0.14         Use q = 0.60
Hyde                                        Simple, EXP(100)    0.86         Use q = 0.573
Hollander & Proschan                        Simple, EXP(100)    0.49         Use q = 0.573
Ebrahimi & Habibullah                       Simple, EXP(100)    0.66         Use q = 0.573
Csorgo & Horvath                            Simple, EXP(100)    0.04         Use q = 0.573
C. H. Chen                                  Composite, EXP      0.026        Use q = 0.573
Burke                                       Composite, EXP      p < 0.025    Use q = 0.50, n = 100
KME-modified Cramer-von Mises (Reineke)     Composite, EXP      p < 0.025    Use q = 0.60, n = 200
KME-modified Anderson-Darling (Reineke)     Composite, EXP      p < 0.025    Use q = 0.60, n = 200

Using the set of 90 failure times, measured in weeks, and the Cramer-von Mises test for complete samples, as outlined in [32], a test for the 2-parameter Weibull distribution (transformed to the extreme value distribution) does not reject the composite hypothesis of an underlying Weibull distribution for the crude life of the prostate cancer survival time ($W^2(1 + 0.2/\sqrt{n}) = 0.9063$, $0.10 < p < 0.25$). The maximum likelihood estimates of the parameters of the 2-parameter Weibull distribution for the prostate cancer survival times are $\hat{\beta} = 0.89$ for the shape and $\hat{\eta} = 34.14$ for the scale. None of the typical lifetime distributions fit the crude life of the censoring times, so the semi-parametric crude life procedure was used. The empirical survivor function of the set of censoring times is shown in Figure 7. Figure 8 shows the juxtaposition of distribution function estimates from the semi-parametric crude life procedure, the Kaplan-Meier estimator, the maximum likelihood estimator for the exponential distribution, and the hypothesized exponential distribution with mean 100. It would appear that the estimate from the semi-parametric crude life test more closely resembles the Kaplan-Meier estimate, which should be relatively good for this sample size. This further indicates that the exponential distribution does not appear to be a suitable parametric model for the underlying distribution of the survival time for prostate cancer patients treated with estrogen as in this study.


72 


[Plot title: Distribution Function of Leukemia Remission Times]

[Plot title: Empirical Survivor Function for Crude Life of Censoring Times: Prostate Cancer Data]

Figure 7 Empirical Survivor Function of Crude Life Censoring Times for Semi-Parametric Crude
Life Test.


3.9 Power Studies


3.9.1 Exponential Failure with Exponential Censoring. The power of a goodness-of-fit
test measures how well the test identifies distributions that are different from the one specified
by the null hypothesis. Generally, for one-sided test procedures such as these, we reject the null
hypothesis and conclude that the underlying random variable is not exponentially distributed when
the observed test statistic is greater than the critical value for a given sample size, proportion of
censoring, and significance level. The empirical size of the test is given by the column in which the
true distribution is the one specified in the null hypothesis, in this case the exponential distribution.
If the test is unbiased, the size should be equal to the level of significance $\alpha$ that was used to
determine the percentage point used as the critical value for the hypothesis test.

A Monte Carlo study was conducted to compare goodness-of-fit tests based on the modified
Cramer-von Mises and Anderson-Darling statistics with each other and with the statistics derived
by Burke and C. H. Chen, and to examine their effectiveness in detecting departures from a
randomly censored exponential distribution when the assumption of an exponential censoring
distribution is correct. Power studies for the simultaneous tests of crude lifetimes (STCL) of Section
3.7 are also included. An exact comparison of the power of the STCL to the other goodness-of-fit
procedures is only possible under the scenario of an exponential lifetime subject to an exponentially
distributed random censoring variable. The reason for this is that in the competing risks model
with independent risks, exponentially distributed net lifetimes always correspond to exponentially
distributed crude lifetimes. Thus, we conduct complete sample goodness-of-fit procedures for the
exponential distribution with the STCL. As proof of this, consider generalizing the work of Leemis
[87: Example 5.2]. Suppose we have two independent exponentially distributed net lifetimes with
survivor functions


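$$S_T(t) = e^{-t/\lambda}, \quad t > 0, \ \lambda > 0$$

and

$$S_C(c) = e^{-c/\theta}, \quad c > 0, \ \theta > 0.$$

The crude survivor function of the risk associated with the net lifetime of interest, T, is

$$
\begin{aligned}
S_{Y_T}(y_T) &= \frac{P[X > y_T,\ \delta = 1]}{P[\delta = 1]} = \frac{P[X > y_T,\ T < C]}{P[T < C]} \\
&= \frac{\int_{y_T}^{\infty} \int_{t}^{\infty} \frac{1}{\lambda} e^{-t/\lambda} \, \frac{1}{\theta} e^{-c/\theta} \, dc \, dt}
        {\int_{0}^{\infty} \int_{t}^{\infty} \frac{1}{\lambda} e^{-t/\lambda} \, \frac{1}{\theta} e^{-c/\theta} \, dc \, dt} \\
&= \frac{\left(\frac{\theta}{\lambda + \theta}\right) e^{-y_T/\lambda - y_T/\theta}}{\left(\frac{\theta}{\lambda + \theta}\right)}
 = e^{-y_T/\lambda - y_T/\theta}, \quad y_T > 0.
\end{aligned}
$$

The crude lifetime associated with the other net risk is found similarly as

$$
\begin{aligned}
S_{Y_C}(y_C) &= \frac{P[X > y_C,\ \delta = 0]}{P[\delta = 0]} = \frac{P[X > y_C,\ C < T]}{P[C < T]} \\
&= \frac{\int_{y_C}^{\infty} \int_{c}^{\infty} \frac{1}{\lambda} e^{-t/\lambda} \, \frac{1}{\theta} e^{-c/\theta} \, dt \, dc}
        {\int_{0}^{\infty} \int_{c}^{\infty} \frac{1}{\lambda} e^{-t/\lambda} \, \frac{1}{\theta} e^{-c/\theta} \, dt \, dc} \\
&= \frac{\left(\frac{\lambda}{\lambda + \theta}\right) e^{-y_C/\lambda - y_C/\theta}}{\left(\frac{\lambda}{\lambda + \theta}\right)}
 = e^{-y_C/\lambda - y_C/\theta}, \quad y_C > 0,
\end{aligned}
$$
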


yielding the same exponential distribution as the crude lifetime for the first risk. Conversely, the
only way to obtain exponentially distributed net lifetimes from crude lifetimes using equation 25 is
when the crude lifetimes follow the same exponential distribution. This is the only way to obtain
a constant hazard rate for each net lifetime. With this in mind, power studies can be conducted
using the STCL on exactly the same distributions and alternatives as the KME-modified Cramer-von
Mises and Anderson-Darling tests, Burke’s test, and Chen’s test.
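
The result above can be checked by simulation. The following is a minimal sketch, not part of the original study: it assumes the scale parameterization of the exponential (means $\lambda$ and $\theta$), and the function name and the parameter values 100 and 150 are purely illustrative. Under those assumptions both crude lifetimes should have mean $\lambda\theta/(\lambda+\theta)$, and the proportion of observed failures should be near $\theta/(\lambda+\theta)$.

```python
import random


def simulate_crude_lifetimes(lam, theta, n, seed=0):
    """Simulate n competing-risks observations with exponential net lifetimes.

    lam and theta are the (assumed) means of the net lifetime T and the net
    censoring time C. Returns the crude failure times and crude withdrawal times.
    """
    rng = random.Random(seed)
    failures, withdrawals = [], []
    for _ in range(n):
        t = rng.expovariate(1.0 / lam)    # net lifetime of interest
        c = rng.expovariate(1.0 / theta)  # net censoring time
        if t < c:
            failures.append(t)      # delta = 1: failure observed
        else:
            withdrawals.append(c)   # delta = 0: item withdrawn (censored)
    return failures, withdrawals


# Illustrative values only.
lam, theta = 100.0, 150.0
failures, withdrawals = simulate_crude_lifetimes(lam, theta, n=200_000)

crude_mean = lam * theta / (lam + theta)          # common mean of both crude lifetimes
mean_fail = sum(failures) / len(failures)         # sample mean of crude failure times
mean_wd = sum(withdrawals) / len(withdrawals)     # sample mean of crude withdrawal times
prop_fail = len(failures) / (len(failures) + len(withdrawals))

print(crude_mean, mean_fail, mean_wd, prop_fail)
```

Both sample means should land close to the common crude mean, which is the sense in which the failure set and the withdrawal set can each be tested as a complete exponential sample.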

One thousand tests of each type were conducted for sample sizes 20(20)200, expected proportions
of censoring 0.10(0.10)0.90, and levels of significance $\alpha$ = 0.10, 0.05, and 0.025. The empirical
power of each test was computed as the ratio of the number of times the observed test statistic
exceeded the appropriate percentage point to the total number of tests. Empirical powers
are given in Tables 30 through 38 in Appendix F. Figures 43 through 57 in Appendix F display
plots of the empirical power of each test for the exponential distribution in each case examined. The
censoring distribution in all simulations was exponential with the scale parameter adjusted to
provide the amount of expected censoring for each case. It should also be noted that samples with
fewer than two failures were excluded due to computational difficulties. Furthermore, the samples used
in the power study were generated using parameter values that matched the expected proportion of
censoring for each case. The alternative distributions used in the power study for the exponential
are plotted with the hypothesized exponential distribution in Figure 9. The Weibull with shape 2
differs the most from the exponential and should therefore be the easiest alternative for the test
statistics to detect. The gamma distribution with shape 1.5 and the lognormal distribution are
more difficult alternatives to detect; thus, the statistics will display lower power against them.
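
The simulation loop described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the code used in the study: it covers only the null case of exponential failures, it assumes the scale parameterization (so that $P[C < T] = \lambda/(\lambda + \theta) = q$ gives the censoring mean $\theta = \lambda(1-q)/q$), and all function names are hypothetical. The statistic is passed in as a callable; the stand-in used in the usage lines is simply the observed censored fraction and is not one of the tests compared in the study.

```python
import random


def censoring_mean(failure_mean, q):
    """Exponential censoring mean giving expected censoring proportion q
    when failures are exponential with mean failure_mean (assumed setup)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    # Solve failure_mean / (failure_mean + theta) = q for theta.
    return failure_mean * (1.0 - q) / q


def censored_sample(n, failure_mean, q, rng):
    """Return n pairs (x, delta): x = min(T, C), delta = 1 if a failure was observed."""
    theta = censoring_mean(failure_mean, q)
    sample = []
    for _ in range(n):
        t = rng.expovariate(1.0 / failure_mean)
        c = rng.expovariate(1.0 / theta)
        sample.append((min(t, c), 1 if t < c else 0))
    return sample


def rejection_rate(statistic, critical_value, n, failure_mean, q,
                   n_tests=1000, seed=1):
    """Fraction of n_tests simulated samples whose statistic exceeds critical_value."""
    rng = random.Random(seed)
    rejections = 0
    completed = 0
    while completed < n_tests:
        sample = censored_sample(n, failure_mean, q, rng)
        if sum(delta for _, delta in sample) < 2:
            continue  # samples with fewer than two failures are excluded
        completed += 1
        if statistic(sample) > critical_value:
            rejections += 1
    return rejections / n_tests


# Stand-in statistic for demonstration only: the observed censored fraction.
def censored_fraction(sample):
    return 1.0 - sum(delta for _, delta in sample) / len(sample)


rate = rejection_rate(censored_fraction, critical_value=0.5,
                      n=100, failure_mean=100.0, q=0.2)
print(rate)
```

Because the failures here follow the null distribution, the returned ratio plays the role of an empirical size; an empirical power would be obtained by replacing the failure generator with one of the alternatives and recalibrating the censoring parameter to the target proportion.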

The power of each test is dependent on the sample size, the amount of censoring, and the
underlying distribution. Naturally, as the sample size is increased, the power also increases. In
contrast, as the level of censoring increases, the power of each test decreases. As with goodness-of-fit
tests for complete samples, power is generally higher for alternative distributions whose cumulative
distribution functions differ to a greater degree in shape from the family specified by the null
hypothesis. Keep in mind that when comparing two competing tests for the same set of hypotheses
under the same circumstances, the test with the higher power is considered to be better.
Therefore, when a relatively powerful goodness-of-fit test exhibits low power against a given alternative
distribution, one could interpret that as meaning that the alternative distribution may be just as
effective at modeling the underlying process as the distribution specified in the null hypothesis.

[Figure 9 panels: Weibull (2) Alternative; Gamma (1.5) Alternative; Gamma (2) Alternative;
Lognormal from N(0,1) Alternative. Exponential: solid line; alternative: dotted line.]

Figure 9 PDF’s of Alternative Distributions Used in the Power Study of Tests for Exponentiality.

3.9.2 Weibull Failure with Exponential Censoring. In the following power study the
null hypothesis is that the underlying distribution is from a Weibull family with shape parameter
$\beta = 2$. The empirical size of the test is given by the column in which the true distribution is the
one specified in the null hypothesis, Weibull with shape 2. If the test is unbiased, the size should
be equal to the level of significance $\alpha$ that was used to determine the percentage point used as
the critical value for the hypothesis test. The alternative distributions used in the power study for
the Weibull with shape 2 are plotted with the hypothesized distribution in Figure 10. All of the
alternatives chosen for this power study closely resemble the hypothesized Weibull with shape 2.
Empirical powers of the KME-modified Cramer-von Mises and Anderson-Darling statistics observed
from 1000 tests of each type are given in Tables 14, 15, and 16 for sample sizes 20(20)100 and
expected proportions of censoring 0.20, 0.50, and 0.80. The
goodness-of-fit tests for the Weibull distribution with unknown location and scale were conducted
as outlined in Section 3.6. Unfortunately, distributions of the crude lifetimes that correspond to one
Weibull and one exponential net lifetime are not typical parametric distributions for which testing
procedures are available, so the STCL and SPCLT procedures are not included in the power studies
for the tests for Weibull distributions.

A power study was also conducted for the Weibull family with known shape $\beta = 3.5$. The
alternative distributions used in the power study for the Weibull with shape 3.5 are plotted with the
more symmetrically shaped version of the Weibull in Figure 11. The normal distribution matches
up very closely with the Weibull with shape 3.5 and so we can expect each test statistic to register


[Figure 10 panels: Weibull (2.5) Alternative; Gamma (2) Alternative; Gamma (2.5) Alternative;
Lognormal from N(0.4,0.67) Alternative. Weibull (2): solid line; alternative: dotted line.]

Figure 10 PDF’s of Alternative Distributions Used in the Power Study of Tests for the Weibull
(shape $\beta$ = 2).



Table 14 Empirical Power of Modified Cramer-von Mises (CvM) and Anderson-Darling (AD) Statistics at $\alpha$ = 0.10.
Weibull Distribution (Shape=2) with Exponential Censoring*

Columns are the alternative distributions. Each cell lists CvM / AD; [illegible] marks an entry that
cannot be read, and a lone value is the only legible entry of its pair.

q     n     Weibull           Weibull           Gamma             Gamma             Lognormal
            shape=2           shape=2.5         shape=2           shape=2.5         from N(0.4,0.67)
0.20  20    0.098 / 0.110     0.111 / 0.086     0.338 / 0.369     0.225 / 0.236     [illegible]
0.20  40    0.094 / 0.090     0.183 / 0.133     0.530 / 0.573     0.334 / 0.342     [illegible]
0.20  60    0.084 / 0.075     0.283 / 0.241     0.675 / 0.720     0.417 / 0.429     [illegible]
0.20  80    0.107 / 0.100     0.372 / 0.326     0.795 / 0.827     0.519 / 0.531     0.942 / 0.941
0.20  100   0.093 / 0.101     0.464 / 0.431     0.869 / 0.896     0.605 / 0.622     0.968 / 0.967
0.50  20    0.096 / 0.095     0.090 / 0.060     0.172 / 0.209     0.130 / 0.152     0.423
0.50  40    0.087 / 0.092     0.164 / 0.120     0.311 / 0.373     0.180 / 0.204     0.468 / 0.480
0.50  60    0.093 / 0.090     0.240 / 0.179     0.432 / 0.515     0.262 / 0.282     0.577 / 0.579
0.50  80    0.111 / 0.108     0.297 / 0.237     0.505 / 0.587     0.288 / 0.320     0.653 / 0.656
0.50  100   0.102 / 0.106     0.353 / 0.309     0.589 / 0.677     0.348 / 0.408     0.711 / 0.726
0.80  20    0.099 / 0.103     0.072 / 0.065     0.099 / 0.105     0.078 / 0.082     [illegible]
0.80  40    0.102 / 0.095     0.056 / 0.047     0.141 / 0.151     0.106 / 0.116     [illegible]
0.80  60    0.104 / 0.099     0.087 / 0.066     0.164 / 0.167     0.131 / 0.134     0.140 / 0.148
0.80  80    0.110 / 0.105     0.098 / 0.064     0.200 / 0.225     0.129 / 0.142     0.155 / 0.152
0.80  100   0.107 / 0.099     0.129 / 0.087     0.233 / 0.248     0.147 / 0.155     0.164 / 0.160

*Boldfaced numbers are significantly higher


low power in detecting the normal alternative. However, we should expect relatively high power
in the detection of both of the gamma alternatives and moderate power in detecting the change
in shape parameter for an underlying Weibull distribution. Empirical powers of the KME-modified
Cramer-von Mises and Anderson-Darling statistics
observed from 1000 tests for each sample size, proportion of censoring, and significance level are
given in Tables 17, 18, and 19 for sample sizes 20(20)100 and expected proportions of censoring
0.20, 0.50, and 0.80. The goodness-of-fit tests for the Weibull distribution with unknown location
and scale were conducted according to the procedure outlined in Section 3.6.



Table 15 Empirical Power of Modified Cramer-von Mises (CvM) and Anderson-Darling (AD) Statistics at $\alpha$ = 0.05.
Weibull Distribution (Shape=2) with Exponential Censoring*

Columns are the alternative distributions. Each cell lists CvM / AD; [illegible] marks an entry that
cannot be read, and a lone value is the only legible entry of its pair.

q     n     Weibull           Weibull           Gamma             Gamma             Lognormal
            shape=2           shape=2.5         shape=2           shape=2.5         from N(0.4,0.67)
0.20  20    0.056 / 0.052     0.070 / 0.044     0.221 / 0.255     0.142 / 0.148     [illegible] / 0.692
0.20  40    0.043 / 0.039     0.108 / 0.082     0.416 / 0.429     0.233 / 0.236     [illegible] / 0.774
0.20  60    0.036 / 0.033     0.192 / 0.167     0.569 / 0.606     0.319 / 0.320     0.852
0.20  80    0.050 / 0.062     0.271 / 0.238     0.694 / 0.731     0.419 / 0.421     0.920 / 0.919
0.20  100   0.052 / 0.052     0.359 / 0.326     0.801 / 0.828     0.492 / 0.511     0.954 / 0.948
0.50  20    0.056 / 0.050     0.036 / 0.019     0.105 / 0.130     0.066 / 0.088     [illegible] / 0.359
0.50  40    0.042 / 0.048     0.102 / 0.049     0.204 / 0.268     0.100 / 0.125     [illegible] / 0.403
0.50  60    0.054 / 0.048     0.159 / 0.093     0.315 / 0.385     0.165 / 0.207     0.500 / 0.488
0.50  80    0.044 / 0.049     0.186 / 0.139     0.374 / 0.450     0.187 / 0.225     0.570 / 0.573
0.50  100   0.043 / 0.055     0.234 / 0.181     0.483 / 0.577     0.254 / 0.298     0.641 / 0.632
0.80  20    0.054 / 0.058     0.028 / 0.023     0.063 / 0.066     0.033 / 0.036     [illegible] / 0.030
0.80  40    0.045 / 0.047     0.015 / 0.010     0.079 / 0.089     0.061 / 0.063     0.052
0.80  60    0.048 / 0.050     0.026 / 0.011     0.107 / 0.116     0.079 / 0.082     0.063
0.80  80    0.053 / 0.055     0.032 / 0.014     0.126 / 0.139     0.073 / 0.089     0.097
0.80  100   0.048 / 0.051     0.044 / 0.013     0.153 / 0.157     0.085 / 0.103     0.104

*Boldfaced numbers are significantly higher


Table 16 Empirical Power of Modified Cramer-von Mises (CvM) and Anderson-Darling (AD) Statistics at $\alpha$ = 0.025.
Weibull Distribution (Shape=2) with Exponential Censoring*

Columns are the alternative distributions. Each cell lists CvM / AD; [illegible] marks an entry that
cannot be read, and a lone value is the only legible entry of its pair.

q     n     Weibull           Weibull           Gamma             Gamma             Lognormal
            shape=2           shape=2.5         shape=2           shape=2.5         from N(0.4,0.67)
0.20  20    0.028 / 0.030     0.035 / 0.023     0.160 / 0.166     0.098 / 0.111     0.649
0.20  40    0.021 / 0.019     0.074 / 0.042     0.322 / 0.332     0.180 / 0.172     0.745
0.20  60    0.014 / 0.015     0.144 / 0.104     0.457 / 0.496     0.238 / 0.244     0.825
0.20  80    0.028 / 0.034     0.188 / 0.158     0.591 / 0.628     0.321 / 0.325     0.894 / 0.893
0.20  100   0.027 / 0.027     0.262 / 0.240     0.712 / 0.751     0.400 / 0.413     0.930 / 0.935
0.50  20    0.022 / 0.023     0.007             0.063 / 0.076     0.036 / 0.053     0.276
0.50  40    0.026 / 0.025     0.048 / 0.021     0.129 / 0.171     0.063 / 0.085     0.351 / 0.349
0.50  60    0.031 / 0.024     0.097 / 0.047     0.219 / 0.290     0.109 / 0.144     [illegible] / 0.426
0.50  80    0.025 / 0.023     0.126 / 0.072     0.277 / 0.346     0.124 / 0.159     [illegible] / 0.510
0.50  100   0.023 / 0.028     0.149 / 0.093     0.377 / 0.461     0.175 / 0.218     0.566
0.80  20    0.033 / 0.033     0.011 / 0.007     0.030 / 0.033     0.022 / 0.022     0.010 / 0.010
0.80  40    0.023 / 0.024     0.006 / 0.005     0.042 / 0.054     0.037 / 0.042     0.024 / 0.024
0.80  60    0.021 / 0.024     0.006 / 0.002     0.069 / 0.078     0.040 / 0.047     0.031 / 0.024
0.80  80    0.028 / 0.027     0.009 / 0.003     0.067 / 0.076     0.038 / 0.038     0.056 / 0.053
0.80  100   0.025 / 0.029     0.012 / 0.002     0.092 / 0.100     0.060 / 0.059     0.067 / 0.059

*Boldfaced numbers are significantly higher


Table 17 Empirical Power of Modified Cramer-von Mises and Anderson-Darling Statistics at $\alpha$ = 0.10.
Weibull Distribution (Shape=3.5) with Exponential Censoring*

*Boldfaced numbers are significantly higher

Table 18 Empirical Power of Modified Cramer-von Mises and Anderson-Darling Statistics at $\alpha$ = 0.05.
Weibull Distribution (Shape=3.5) with Exponential Censoring*

*Boldfaced numbers are significantly higher


Table 19 Empirical Power of Modified Cramer-von Mises (CvM) and Anderson-Darling (AD) Statistics at $\alpha$ = 0.025.
Weibull Distribution (Shape=3.5) with Exponential Censoring*

Columns are the alternative distributions. Each cell lists CvM / AD.

q     n     Weibull             Weibull             Gamma               Gamma               N(45,14)
            shape=3.5           shape=[illegible]   shape=3.5           shape=4
0.20  20    0.022 / 0.034       0.053 / 0.056       0.304 / 0.229       0.309 / 0.233       0.020 / 0.017
0.20  40    0.019 / 0.015       0.083 / 0.069       0.574 / 0.520       0.617 / 0.547       0.021 / 0.017
0.20  60    0.015 / 0.020       0.113 / 0.122       0.689 / 0.661       0.781 / 0.763       0.015 / 0.012
0.20  80    0.023 / 0.033       0.196 / 0.193       0.757 / 0.786       0.831 / 0.841       0.021 / 0.020
0.20  100   0.025 / 0.026       0.239 / 0.287       0.881 / 0.929       0.894 / 0.922       0.028 / 0.024
0.50  20    0.021 / 0.019       0.048 / 0.069       0.248 / 0.317       0.232 / 0.286       0.030 / 0.022
0.50  40    0.025 / 0.029       0.062 / 0.102       0.553 / 0.657       0.485 / 0.578       0.023 / 0.019
0.50  60    0.028 / 0.017       0.092 / 0.142       0.750 / 0.824       0.698 / 0.796       0.024 / 0.019
0.50  80    0.029 / 0.026       0.096 / 0.164       0.860 / 0.923       0.826 / 0.896       0.032 / 0.032
0.50  100   0.026 / 0.029       0.141 / 0.229       0.924 / 0.963       0.903 / 0.951       0.035 / 0.037
0.80  20    0.028 / 0.028       0.048 / 0.051       0.056 / 0.076       0.049 / 0.065       0.020 / 0.021
0.80  40    0.024 / 0.025       0.054 / 0.060       0.104 / 0.151       0.115 / 0.158       0.026 / 0.022
0.80  60    0.025 / 0.025       0.050 / 0.071       0.203 / 0.267       0.186 / 0.231       0.016 / 0.018
0.80  80    0.023 / 0.016       0.064 / 0.085       0.315 / 0.376       0.244 / 0.312       0.014 / 0.015
0.80  100   0.017 / 0.022       0.079 / 0.110       0.379 / 0.426       0.314 / 0.386       0.017 / 0.015

*Boldfaced numbers are significantly higher


IV. Summary and Conclusions


In addition to a historical perspective on the random censoring problem, maximum likelihood
and minimum distance estimation methods for a 3-parameter Weibull distribution subject to an
exponential censoring distribution were presented. Computing formulas were derived for the
Cramer-von Mises and Anderson-Darling statistics to be used as distance estimators in the case of
randomly censored samples, with the EDF in each statistic replaced by the Kaplan-Meier estimator.
It was demonstrated that minimum distance estimation can be effectively used to estimate location
parameters when samples contain censored items, particularly when used in tandem with maximum
likelihood estimation of the shape and scale parameters. The effectiveness of that technique was
demonstrated for sample sizes 20 and 60 with expected proportions of censoring 0.25, 0.50, and
0.75. A comparison of initial location parameter estimates for the minimum distance estimation
procedure showed that the technique is robust to the choice of initial estimates of the location
parameter. The comparison also revealed that there is no significant difference in the estimators
regardless of whether the Cramer-von Mises or the Anderson-Darling distance measure is used.

Four known nonparametric distribution function estimators were outlined in addition to the
introduction of a new continuous estimator in the form of a trigonometrically-smoothed and
jackknifed product-limit estimator. Two semi-parametric estimators were presented and all of the
estimation techniques were compared under various levels of censoring and for a variety of
distribution shapes using mean integrated squared error, Kruskal-Wallis tests, and side-by-side boxplots to
determine differences in ISE among the estimators. Due to the inherent skewness of ISE, we do not
recommend comparing distribution function estimators on the basis of MISE alone. We recommend
that Kruskal-Wallis tests be used to compare ISE populations to judge the relative effectiveness
of the estimators, in addition to the examination of side-by-side boxplots. The results of that study show
that maximum likelihood is the best estimator given that the distribution is correctly specified.
The comparison of estimators also revealed that the PEXE and FRWE are the best nonparametric
estimators for both skewed and symmetric underlying distributions under light, moderate, and
heavy censoring conditions. However, the FRWE and BSE require considerable computer time to
construct in comparison with the non-kernel-type estimators. Trigonometric smoothing was shown
to neither improve nor diminish the ability of the Kaplan-Meier estimator in terms of ISE, while the
jackknifing of the trigonometrically-smoothed KME increased the ISE at 25% expected censoring.
The quality of distribution function estimation produced by the MONE is worse than that of the
KME only when censoring is high; otherwise the two methods are not significantly different in terms of
ISE. The kernel estimator of Blum and Susarla was the worst of the estimators, especially at 50%
and 80% expected censoring. The KME is the easiest and fastest to compute, while the kernel-type
estimators require the most computer time to construct.

A discussion of the asymptotic theory of goodness-of-fit statistics for composite tests of
hypothesis based on the difference between the Kaplan-Meier estimator and the maximum likelihood
estimator was provided. Current theory was summarized for the exponential distribution when
the censoring distribution is also exponential and for the Weibull distribution when the censoring
distribution has a hazard function that is proportional to that of the failure distribution. The Efron
transform, which was adapted from Doob’s work in stochastic processes, is a method of
transforming a general Gaussian process to a Brownian motion process. That is, the limiting behavior of the
Kaplan-Meier estimator can be transformed to a known process for which percentage points can
be obtained analytically in the case of a simple hypothesis. Unfortunately, it is shown that this
transform is no longer possible when the hypothesis is composite and parameters are estimated.
Problems with the Efron transform in the case of estimated parameters were discussed for both the
Weibull and exponential distributions.

Finally, several new goodness-of-fit tests were developed and introduced. This class of new
tests is based on replacing the EDF with the KME in the Cramer-von Mises and Anderson-Darling
statistics since the EDF is no longer available when samples are randomly censored. Although the
PEXE and FRWE performed significantly better than the KME in the comparison of estimators,
the KME was chosen as the substitute for the EDF in these goodness-of-fit statistics because it is
very easy to compute, numerical methods are not required, exact computing formulas are available,
it reduces to the EDF when no censoring is present, and it is known to converge more rapidly
than the FRWE [48]. Procedures for testing the hypothesis of exponentiality are given in the
case of an exponentially distributed censoring distribution. Procedures are also given for testing
the null hypothesis of a Weibull distribution with shape 2 and a Weibull with shape 3.5, both in
the case of an exponentially distributed random censoring variable. As part of these new tests,
computing formulas are derived and the location and scale invariance of the new KME-modified
Cramer-von Mises and Anderson-Darling statistics are examined. Percentage points are obtained
through Monte Carlo simulation for tests for the exponential with estimated scale parameter for
sample sizes 20(20)200 and proportions of censoring 0.10(0.10)0.90. Likewise, percentage points
are obtained for tests for the Weibull distribution with shape 2 and estimated location and scale
and for the Weibull distribution with shape 3.5 with estimated location and scale parameters for
sample sizes 20(20)100 and proportions of censoring 0.10(0.10)0.90. This class of goodness-of-fit
tests requires the assumption of an exponentially distributed random censoring variable. Some
justifications of this assumption are offered through Drenick’s Theorem [37] in Section 3.3.

For the test of exponentiality with an exponential censoring distribution, the powers of the
KME-modified Cramer-von Mises and Anderson-Darling statistics were compared with the powers
of existing procedures by Burke and C. H. Chen. The simultaneous test of crude lifetimes (STCL)
was also performed on the exponential distribution with an exponential censoring variable using
both the complete sample Cramer-von Mises and Anderson-Darling test statistics. Chen’s test had
the lowest power, but a comparison was only available at 20% censoring. Burke’s test, too, had lower
power than the KME-modified statistics in most cases, particularly for sample sizes under 60.
Burke’s test did, however, tend to catch up in terms of power for samples of size 80 or more. The
KME-modified Cramer-von Mises and Anderson-Darling statistics are nearly equivalent in power
and displayed relatively higher power, in general, than the statistics proposed by Burke and Chen.
The empirical powers of the STCL tests were either statistically equal to that of the other tests or
statistically greater than all of the other tests, but were never significantly lower than any other
goodness-of-fit test in the study.

When testing the hypothesis of an underlying Weibull distribution under the assumption
of an exponentially distributed censoring variable, there were no existing tests with which to compare
the KME-modified Cramer-von Mises and Anderson-Darling statistics. For an underlying Weibull
distribution with shape 2, the KME-modified Cramer-von Mises dominated the KME-modified
Anderson-Darling in detecting a Weibull with shape 2.5. Conversely, the KME-modified
Anderson-Darling was dominant over the KME-modified Cramer-von Mises in detecting a gamma distribution
with shape 2 when testing for the Weibull with shape 2. It is evident in Tables 17, 18, and 19
that the KME-modified Anderson-Darling performed significantly better in terms of power than
the KME-modified Cramer-von Mises in many cases.

Another class of goodness-of-fit tests was constructed using the competing risks concept of
crude lifetimes for the failure and censoring variables. Simultaneous tests of fit are performed on the
crude lifetimes using the failure set and the withdrawal set, each taken as a complete sample. The
censoring distribution is no longer assumed to be exponentially distributed, but it is still necessary
to fit some parametric family to the crude lifetimes of the censored items. After fitting distributions
to both the crude failure and censoring times, one can obtain the hazard function, and hence the
density, survivor, and distribution functions, of the underlying variable of interest. This is referred
to in competing risks theory as a net lifetime and represents the distribution of the variable of
interest if it were the only risk acting on the population, that is, if it were uncensored. The benefit
of this type of test is that existing complete sample goodness-of-fit procedures may be used and
tables of percentage points are readily available. Another advantage is that the power of
goodness-of-fit tests of this type is not dependent on the amount of censoring or the assumption of an
exponentially distributed censoring variable. This procedure was applied to the case of testing for
an underlying exponential distribution censored by an exponentially distributed random variable
and displayed power at least as high as the KME-modified Cramer-von Mises and Anderson-Darling
tests. The drawback is that numerical methods will most likely be needed to convert the crude
lifetime distributions to the net lifetime distribution of the underlying process, and the resulting
density, survivor, and distribution functions may not be very easy to manipulate. Nevertheless,
many types of reliability analyses and maintenance planning may still be performed within this
framework.

Finally, new goodness-of-fit procedures are developed, again based on the relationship of the
crude lifetimes to the net lifetimes. These tests are considered partially parametric in that they
require no parametric assumption on the distribution of the censoring random variable. The
empirical survivor function of the crude withdrawal times is all that is necessary to adjust the crude
lifetime distribution to yield the net lifetime distribution of the underlying cause of failure. The
hypothesis test is conducted on the crude failure times only. Existing complete sample goodness-of-fit
procedures may be used for this, and the power of this type of test is the same as that of any test
on a complete sample of the same sample size.

Using crude lifetimes to estimate and characterize the net lifetime of interest in cases of
randomly censored data broadens the field of reliability analysis in such a way as to include heavily
censored data that would have previously been unusable. It also brings much greater flexibility to
goodness-of-fit testing with randomly censored samples because any of the existing goodness-of-fit
tests for complete samples may be used to try to characterize the crude lifetimes, which are then
used to characterize the net lifetime of interest. Our recommendation when faced with performing
reliability analysis on a set of randomly censored data is to attempt the simultaneous test of crude
lifetimes first. If no parametric model will adequately characterize the set of withdrawals, then
the semi-parametric crude life procedure should be used to estimate the survivor function of the
variable of interest.


Appendix  A,  Numerical  Integration  with  Simpson’s  Rule 

Simpson’s Rule is a method for evaluating a definite integral $\int_a^b f(x)\,dx$ using a parabolic
approximation. Simpson’s Rule is defined as [9]

$$\int_a^b f(x)\,dx \approx \left[ f(x_1) + 4f(x_2) + 2f(x_3) + \cdots + 4f(x_n) + f(x_{n+1}) \right] \frac{\Delta x}{3}$$

where $x_1, \ldots, x_{n+1}$ are values with equal spacing $\Delta x = \frac{b-a}{n}$ such that $x_1 = a$,
$x_{n+1} = b$, and $n$ is an even integer. Note that the function evaluations are weighted such that the
first and last weights are 1 while the weights for the middle terms are alternating 4’s and 2’s,
beginning and ending with 4.
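The weighting pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this research; the function name `simpson` and its argument names are assumptions introduced here.

```python
def simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] with composite Simpson's Rule.

    n is the number of equal subintervals and must be a positive even integer,
    so the n + 1 evaluation points are x_1 = a, ..., x_{n+1} = b.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    dx = (b - a) / n
    # The first and last points carry weight 1.
    total = f(a) + f(b)
    # Interior points alternate weights 4, 2, 4, ..., 4.
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(a + i * dx)
    return total * dx / 3.0


# Example: the integral of x^2 over [0, 3] is 9, and Simpson's Rule is exact
# for polynomials of degree three or less.
approx = simpson(lambda x: x * x, 0.0, 3.0, 6)
```

Because each pair of subintervals is fitted with a parabola, the rule reproduces integrals of low-degree polynomials exactly, which makes such a case a convenient check.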




Appendix B. An Interesting Net Lifetime Result from Tests Based on Crude Lifetimes

An interesting phenomenon results when the cumulative hazard function of the crude lifetime of
the censoring variable is less than the cumulative hazard function of the crude lifetime of the
variable of interest, namely, when $H_{Y_C}(t) < H_{Y_T}(t)$. The resulting net lifetime is known as a split
population model, or just split model, and has been studied by Schmidt and Witte [118,119]. The
analytical foundations of this phenomenon surface in the following way. The hazard function for the
underlying net lifetime of the random variable of interest, $T$, is determined through the relationship

$$h_T(t) = \frac{p\,f_{Y_T}(t)}{p\,S_{Y_T}(t) + (1-p)\,S_{Y_C}(t)} \tag{28}$$

where $Y_T$ is the crude lifetime of the variable of interest and $Y_C$ is the crude lifetime of the censoring
random variable. Equation 28 can be expressed as

$$h_T(t) = \frac{p\,h_{Y_T}(t)\,S_{Y_T}(t)}{p\,S_{Y_T}(t) + (1-p)\,S_{Y_C}(t)}$$

which, in turn, can be written as

$$h_T(t) = \frac{p\,h_{Y_T}(t)\,e^{-H_{Y_T}(t)}}{p\,e^{-H_{Y_T}(t)} + (1-p)\,e^{-H_{Y_C}(t)}}$$

and finally

$$h_T(t) = \frac{h_{Y_T}(t)}{1 + \frac{1-p}{p}\,e^{-H_{Y_C}(t) + H_{Y_T}(t)}}. \tag{29}$$


Now, as $t \to \infty$, the exponential term in the denominator will dominate the expression, and if
$H_{Y_C}(t) < H_{Y_T}(t)$, then $\lim_{t\to\infty} h_T(t) < \infty$. Consequently, $\lim_{t\to\infty} F_T(t) < 1$. The dominance of

the exponential term in the denominator is affected by the proportion of observed failures in the
sample, $p$, as well. As $p$ gets closer to 1, $\lim_{t\to\infty} F_T(t)$ will also get closer to 1. An example of a
net lifetime cumulative distribution function is shown in Figure 12 to illustrate. In the example,
the  crude  censoring  variable  follows  a  Weibull  distribution  with  shape  2.5  and  scale  25  while  the 
distribution  of  the  crude  lifetime  of  the  variable  of  interest  is  Weibull  with  shape  2  and  scale  15. 
Figure  12  makes  it  clear  that  only  about  68%  of  the  population  would  fail  from  the  underlying  risk 
of  interest  if  it  were  the  only  risk  acting  on  the  population  (while  the  remaining  members  of  the 
population  live  forever).  Although  this  does  not  meet  the  criterion  of  what  we  usually  consider  to 
be a “legitimate” lifetime distribution, a plausible explanation for this phenomenon may be that
a certain proportion of the population will not fail as a result of the underlying variable of interest
when that is the only risk present. Consider a medical follow-up study in which there is a randomly
censored sample consisting of observed times of death and withdrawal times. It is possible that,
if we were able to observe all of the original subjects in the sample without any censoring, a
certain proportion of them would never die of cancer. In essence, for all practical purposes the
resulting distribution and survivor functions yield useful results and should be used in reliability
and survival analysis.
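The limiting behavior described in this appendix can be explored numerically. The sketch below evaluates the net hazard of Equation 28 for the two crude Weibull distributions of the example and integrates it with Simpson’s Rule to obtain $F_T(t) = 1 - e^{-H_T(t)}$. The proportion of observed failures used for Figure 12 is not stated here, so the values of `p` below are hypothetical, and all function names are assumptions made for illustration.

```python
import math


def weibull_pdf(t, shape, scale):
    """Weibull density at t >= 0 (shape >= 1 assumed so t = 0 is safe)."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_sf(t, shape, scale):
    """Weibull survivor function at t >= 0."""
    return math.exp(-((t / scale) ** shape))


def net_hazard(t, p, interest, censoring):
    """Net hazard of Equation 28; interest and censoring are (shape, scale) pairs."""
    numerator = p * weibull_pdf(t, *interest)
    denominator = p * weibull_sf(t, *interest) + (1.0 - p) * weibull_sf(t, *censoring)
    return numerator / denominator


def net_cdf(t, p, interest, censoring, n=2000):
    """F_T(t) = 1 - exp(-H_T(t)), with H_T(t) integrated by Simpson's Rule on [0, t]."""
    dx = t / n
    total = net_hazard(0.0, p, interest, censoring) + net_hazard(t, p, interest, censoring)
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * net_hazard(i * dx, p, interest, censoring)
    cumulative_hazard = total * dx / 3.0
    return 1.0 - math.exp(-cumulative_hazard)


interest = (2.0, 15.0)    # crude lifetime of the variable of interest: shape 2, scale 15
censoring = (2.5, 25.0)   # crude lifetime of the censoring variable: shape 2.5, scale 25
for p in (0.5, 0.9):      # hypothetical proportions of observed failures
    print(p, net_cdf(100.0, p, interest, censoring))
```

Evaluating at a large time such as $t = 100$ shows the net distribution function levelling off below 1, and the plateau rises as `p` approaches 1, in line with Equation 29.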




Appendix C. Plots Illustrating Estimation Techniques for Randomly Censored Data

Plots illustrating examples of all of the distribution function techniques, with the exception of
the 3-parameter Weibull, are presented in this Appendix. Each estimator was used to find a
distribution function estimate for the exponential, Weibull (with shape 2), and Weibull (with shape
3.5) distributions. With each underlying distribution, an exponentially distributed random censoring
variable was used and its scale parameter was adjusted to provide the desired expected proportion
of censoring. The plots demonstrate the effectiveness of each estimator when the sample is expected
to be 25%, 50%, and 75% censored. Samples of size 40 were used in each case.
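One way the censoring scale could be tuned to a target expected proportion of censoring is sketched below. It assumes independent lifetime $T$ and censoring time $C$, so that the expected censored proportion is $P(C < T) = \int_0^\infty f_C(c)\,S_T(c)\,dc$, and it solves for the exponential censoring mean by bisection. The function names, the bisection bracket, and the reading of "Exponential (50)" as an exponential with scale 50 are assumptions made for illustration, not a description of the code used to produce the plots.

```python
import math
import random


def censoring_probability(theta_c, shape, scale, n=2000):
    """P(C < T) for T ~ Weibull(shape, scale) and C ~ Exponential(mean theta_c)."""
    # Truncate where the integrand is negligible (below about e^-50).
    upper = min(scale * 50.0 ** (1.0 / shape), 50.0 * theta_c)
    dx = upper / n

    def integrand(c):
        return math.exp(-c / theta_c - (c / scale) ** shape) / theta_c

    # Composite Simpson's Rule with weights 1, 4, 2, ..., 4, 1.
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * integrand(i * dx)
    return total * dx / 3.0


def censoring_mean_for(q, shape, scale, iterations=60):
    """Bisect (on a log scale) for the censoring mean giving expected censoring q."""
    lo, hi = scale / 50.0, scale * 50.0
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if censoring_probability(mid, shape, scale) > q:
            lo = mid   # too much censoring: a larger censoring mean is needed
        else:
            hi = mid
    return math.sqrt(lo * hi)


def censored_sample(n, q, shape, scale, rng):
    """Return n pairs (observed time, True if a failure was observed)."""
    theta_c = censoring_mean_for(q, shape, scale)
    sample = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)   # lifetime (scale first, then shape)
        c = rng.expovariate(1.0 / theta_c)     # censoring time (rate = 1 / mean)
        sample.append((min(t, c), t <= c))
    return sample


# Example: a sample of size 40 from Weibull (2, 50) with 25% expected censoring.
sample = censored_sample(40, 0.25, 2.0, 50.0, random.Random(0))
```

For an exponential lifetime (shape 1) with scale $\theta$, the same probability has the closed form $\theta / (\theta + \theta_c)$, which gives a simple check on the numerical solution.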




Figure 13  Maximum Likelihood Estimator, Exponential Distribution. [Plot of the cumulative distribution function; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 14  Maximum Likelihood Estimator, Weibull with shape 2. [Plot of the cumulative distribution function; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 15  Maximum Likelihood Estimator, Weibull with shape 3.5. [Plot of the cumulative distribution function; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 16  Kaplan-Meier Estimator, Exponential Distribution. [Plot of the cumulative distribution function; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 17  Kaplan-Meier Estimator, Weibull with shape 2. [Plot of the cumulative distribution function; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 18  Kaplan-Meier Estimator, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 19  Mean Order Number Estimator, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 20  Mean Order Number Estimator, Weibull with shape 2. [Plot; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 21  Mean Order Number Estimator, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 22  Piecewise Exponential Estimator, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 23  Piecewise Exponential Estimator, Weibull with shape 2. [Plot; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 24  Piecewise Exponential Estimator, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 25  Blum-Susarla Kernel Estimator, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 26  Blum-Susarla Kernel Estimator, Weibull with shape 2. [Plot; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 27  Blum-Susarla Kernel Estimator, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 28  Foldes, Rejto, and Winter Kernel Estimator, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 31  Trigonometrically-Smoothed KME, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 32  Trigonometrically-Smoothed KME, Weibull with shape 2. [Plot; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 33  Trigonometrically-Smoothed KME, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 34  Trigonometrically-Smoothed and Jackknifed KME, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 35  Trigonometrically-Smoothed and Jackknifed KME, Weibull with shape 2. [Plot; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 36  Trigonometrically-Smoothed and Jackknifed KME, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 37  Klein, Lee, and Moeschberger Estimator, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 38  Klein, Lee, and Moeschberger Estimator, Weibull with shape 2. [Plot; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 39  Klein, Lee, and Moeschberger Estimator, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]

Figure 40  Semi-Parametric Kaplan-Meier Estimator, Exponential Distribution. [Plot; lifetime distribution: Exponential (50), sample size n = 40.]

Figure 41  Semi-Parametric Kaplan-Meier Estimator, Weibull with shape 2. [Plot; lifetime distribution: Weibull (2, 50), sample size n = 40.]

Figure 42  Semi-Parametric Kaplan-Meier Estimator, Weibull with shape 3.5. [Plot; lifetime distribution: Weibull (3.5, 50), sample size n = 40.]


Appendix  D.  Percentage  Points  for  New  Tests  for  the  Exponential  Distribution 

The following tables of percentage points are to be used with the new KME-modified Cramér-von
Mises and Anderson-Darling goodness-of-fit tests for exponentiality presented in Section 3.5.
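As a brief illustration of how a percentage point might be applied, the sketch below stores two rows of the $q = .30$ block of Table 20 and compares a computed statistic against the tabled value. It assumes the usual upper-tail convention for Cramér-von Mises and Anderson-Darling type statistics, in which the hypothesized distribution is rejected when the statistic exceeds the percentage point; the names `ALPHAS`, `TABLE_20_Q30`, `percentage_point`, and `reject` are hypothetical.

```python
# Significance levels heading the columns of the percentage-point tables.
ALPHAS = (0.25, 0.20, 0.15, 0.10, 0.05, 0.025)

# Two rows of Table 20 for q = .30, keyed by sample size n.
TABLE_20_Q30 = {
    20: (0.119, 0.132, 0.149, 0.173, 0.215, 0.258),
    40: (0.126, 0.139, 0.156, 0.181, 0.224, 0.269),
}


def percentage_point(n, alpha, table=TABLE_20_Q30):
    """Look up the tabled percentage point for sample size n and level alpha."""
    return table[n][ALPHAS.index(alpha)]


def reject(statistic, n, alpha, table=TABLE_20_Q30):
    """Upper-tail decision rule (assumed): reject when the statistic exceeds the point."""
    return statistic > percentage_point(n, alpha, table)


# Example: with n = 40 and alpha = 0.05 the tabled point is 0.224.
decision = reject(0.250, 40, 0.05)
```

Sample sizes or censoring proportions that fall between tabled values are not handled here; how to treat them is left to the procedures of Section 3.5.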




Table 20  Percentage Points of $W^2_{r,n}$ for the Exponential Distribution with Exponential Censoring.

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .10
   20    0.115   0.129   0.146   0.171   0.214   0.258
   40    0.117   0.130   0.147   0.172   0.217   0.262
   60    0.117   0.131   0.148   0.173   0.218   0.263
   80    0.117   0.131   0.148   0.173   0.218   0.264
  100    0.117   0.131   0.148   0.174   0.219   0.265
  120    0.117   0.131   0.149   0.174   0.219   0.265
  140    0.118   0.131   0.149   0.175   0.219   0.265
  160    0.118   0.131   0.149   0.175   0.219   0.265
  180    0.118   0.131   0.149   0.175   0.220   0.266
  200    0.118   0.132   0.149   0.175   0.220   0.266

q = .20
   20    0.117   0.130   0.147   0.171   0.213   0.256
   40    0.120   0.133   0.150   0.175   0.218   0.261
   60    0.121   0.134   0.151   0.176   0.220   0.264
   80    0.121   0.134   0.151   0.176   0.220   0.264
  100    0.121   0.135   0.151   0.176   0.221   0.265
  120    0.122   0.135   0.152   0.177   0.221   0.266
  140    0.122   0.135   0.153   0.177   0.222   0.267
  160    0.122   0.135   0.153   0.178   0.222   0.268
  180    0.122   0.135   0.153   0.178   0.222   0.268
  200    0.122   0.136   0.153   0.178   0.223   0.268

q = .30
   20    0.119   0.132   0.149   0.173   0.215   0.258
   40    0.126   0.139   0.156   0.181   0.224   0.269
   60    0.128   0.141   0.159   0.185   0.228   0.273
   80    0.129   0.142   0.160   0.185   0.229   0.275
  100    0.129   0.143   0.161   0.186   0.230   0.277
  120    0.130   0.144   0.161   0.187   0.231   0.278
  140    0.130   0.144   0.162   0.188   0.232   0.279
  160    0.131   0.144   0.162   0.188   0.232   0.279
  180    0.131   0.145   0.163   0.188   0.233   0.280
  200    0.131   0.145   0.163   0.188   0.233   0.280


Table 21  Percentage Points of $W^2_{r,n}$ for the Exponential Distribution with Exponential Censoring
(cont’d).

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .40
   20    0.121   0.134   0.152   0.177   0.222   0.269
   40    0.133   0.148   0.167   0.193   0.240   0.290
   60    0.139   0.154   0.174   0.201   0.248   0.299
   80    0.142   0.157   0.176   0.204   0.252   0.303
  100    0.144   0.159   0.179   0.207   0.256   0.307
  120    0.146   0.161   0.181   0.210   0.258   0.308
  140    0.147   0.162   0.182   0.211   0.260   0.310
  160    0.148   0.163   0.183   0.211   0.261   0.312
  180    0.149   0.165   0.184   0.212   0.262   0.313
  200    0.149   0.165   0.186   0.213   0.263   0.314

q = .50
   20    0.120   0.134   0.153   0.181   0.231   0.287
   40    0.142   0.158   0.180   0.211   0.267   0.330
   60    0.154   0.171   0.194   0.228   0.288   0.352
   80    0.162   0.180   0.204   0.238   0.299   0.366
  100    0.168   0.187   0.211   0.247   0.310   0.376
  120    0.173   0.192   0.217   0.252   0.316   0.384
  140    0.177   0.196   0.221   0.257   0.322   0.390
  160    0.180   0.199   0.225   0.261   0.326   0.394
  180    0.182   0.201   0.227   0.263   0.329   0.398
  200    0.184   0.204   0.229   0.266   0.331   0.401

q = .60
   20    0.112   0.128   0.149   0.180   0.240   0.306
   40    0.146   0.165   0.191   0.230   0.306   0.394
   60    0.168   0.189   0.219   0.264   0.349   0.447
   80    0.184   0.208   0.239   0.286   0.376   0.480
  100    0.198   0.222   0.256   0.305   0.400   0.513
  120    0.209   0.235   0.270   0.322   0.422   0.536
  140    0.217   0.245   0.281   0.335   0.437   0.555
  160    0.226   0.254   0.291   0.345   0.450   0.571
  180    0.232   0.262   0.299   0.356   0.462   0.585
  200    0.239   0.267   0.305   0.363   0.472   0.596


Table 22  Percentage Points of $W^2_{r,n}$ for the Exponential Distribution with Exponential Censoring
(cont’d).

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .70
   20    0.094   0.111   0.133   0.166   0.233   0.312
   40    0.133   0.155   0.185   0.233   0.327   0.441
   60    0.165   0.191   0.228   0.284   0.400   0.543
   80    0.191   0.221   0.263   0.329   0.462   0.626
  100    0.214   0.248   0.295   0.367   0.513   0.689
  120    0.235   0.272   0.323   0.402   0.560   0.755
  140    0.253   0.292   0.346   0.429   0.596   0.805
  160    0.271   0.312   0.370   0.457   0.633   0.851
  180    0.285   0.329   0.388   0.481   0.668   0.891
  200    0.298   0.344   0.406   0.502   0.692   0.924

q = .80
   20    0.061   0.076   0.098   0.130   0.195   0.276
   40    0.089   0.109   0.140   0.187   0.283   0.409
   60    0.119   0.146   0.185   0.247   0.377   0.550
   80    0.148   0.180   0.228   0.303   0.461   0.671
  100    0.175   0.213   0.267   0.355   0.539   0.777
  120    0.200   0.243   0.304   0.401   0.608   0.878
  140    0.224   0.271   0.338   0.448   0.674   0.966
  160    0.247   0.299   0.371   0.487   0.734   1.057
  180    0.269   0.324   0.402   0.526   0.789   1.127
  200    0.290   0.348   0.430   0.564   0.851   1.203

q = .90
   20    0.028   0.038   0.052   0.076   0.131   0.195
   40    0.027   0.036   0.051   0.079   0.141   0.233
   60    0.035   0.046   0.065   0.099   0.182   0.299
   80    0.045   0.059   0.083   0.126   0.228   0.374
  100    0.055   0.073   0.101   0.152   0.271   0.445
  120    0.066   0.087   0.120   0.178   0.317   0.513
  140    0.077   0.101   0.139   0.206   0.365   0.588
  160    0.088   0.115   0.157   0.232   0.410   0.656
  180    0.099   0.129   0.176   0.259   0.456   0.734
  200    0.111   0.143   0.194   0.285   0.496   0.795


Table 23  Percentage Points of $A^2_{r,n}$ for the Exponential Distribution with Exponential Censoring.

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .10
   20    0.709   0.784   0.882   1.024   1.274   1.542
   40    0.733   0.811   0.911   1.053   1.310   1.571
   60    0.745   0.822   0.923   1.069   1.323   1.593
   80    0.750   0.827   0.926   1.071   1.329   1.591
  100    0.753   0.832   0.932   1.078   1.336   1.603
  120    0.759   0.836   0.937   1.083   1.340   1.608
  140    0.762   0.840   0.941   1.086   1.341   1.610
  160    0.761   0.840   0.941   1.089   1.343   1.612
  180    0.763   0.841   0.944   1.091   1.348   1.615
  200    0.765   0.843   0.945   1.091   1.348   1.609

q = .20
   20    0.722   0.800   0.901   1.044   1.309   1.587
   40    0.765   0.844   0.948   1.099   1.364   1.650
   60    0.785   0.865   0.970   1.119   1.389   1.668
   80    0.795   0.875   0.980   1.131   1.399   1.674
  100    0.803   0.884   0.989   1.140   1.408   1.685
  120    0.811   0.893   0.997   1.149   1.419   1.692
  140    0.815   0.896   1.002   1.154   1.421   1.706
  160    0.819   0.900   1.007   1.158   1.427   1.708
  180    0.822   0.904   1.010   1.162   1.430   1.705
  200    0.824   0.905   1.014   1.166   1.433   1.708

q = .30
   20    0.730   0.808   0.913   1.065   1.341   1.642
   40    0.799   0.883   0.995   1.156   1.451   1.782
   60    0.834   0.923   1.038   1.205   1.504   1.838
   80    0.857   0.946   1.061   1.228   1.534   1.861
  100    0.875   0.964   1.082   1.251   1.558   1.894
  120    0.889   0.979   1.097   1.267   1.574   1.906
  140    0.900   0.993   1.112   1.284   1.595   1.927
  160    0.910   1.001   1.120   1.291   1.601   1.933
  180    0.918   1.011   1.131   1.303   1.614   1.940
  200    0.922   1.015   1.135   1.309   1.620   1.949


Table 24  Percentage Points of $A^2_{r,n}$ for the Exponential Distribution with Exponential Censoring
(cont’d).

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .40
   20    0.725   0.807   0.915   1.075   1.374   1.729
   40    0.829   0.922   1.043   1.225   1.572   1.978
   60    0.892   0.991   1.121   1.317   1.682   2.109
   80    0.932   1.033   1.170   1.370   1.750   2.191
  100    0.966   1.070   1.210   1.416   1.804   2.254
  120    0.994   1.101   1.244   1.456   1.845   2.312
  140    1.016   1.125   1.268   1.482   1.885   2.343
  160    1.035   1.146   1.291   1.508   1.916   2.375
  180    1.052   1.164   1.314   1.531   1.946   2.415
  200    1.064   1.177   1.327   1.547   1.966   2.432

q = .50
   20    0.701   0.784   0.895   1.060   1.386   1.785
   40    0.842   0.941   1.076   1.282   1.689   2.195
   60    0.933   1.045   1.196   1.428   1.881   2.464
   80    1.002   1.121   1.284   1.527   2.027   2.661
  100    1.062   1.187   1.360   1.620   2.139   2.803
  120    1.112   1.243   1.423   1.695   2.248   2.927
  140    1.150   1.288   1.472   1.759   2.332   3.035
  160    1.190   1.329   1.523   1.815   2.392   3.119
  180    1.223   1.366   1.561   1.860   2.447   3.199
  200    1.246   1.395   1.596   1.903   2.516   3.285

q = .60
   20    0.644   0.727   0.839   1.009   1.351   1.771
   40    0.817   0.922   1.068   1.294   1.763   2.379
   60    0.938   1.064   1.236   1.502   2.069   2.812
   80    1.039   1.176   1.365   1.662   2.292   3.147
  100    1.128   1.276   1.481   1.808   2.510   3.451
  120    1.201   1.362   1.588   1.941   2.693   3.703
  140    1.268   1.438   1.675   2.046   2.848   3.918
  160    1.330   1.510   1.757   2.147   2.978   4.103
  180    1.386   1.574   1.835   2.243   3.114   4.310
  200    1.435   1.625   1.891   2.313   3.228   4.455


Table 25  Percentage Points of $A^2_{r,n}$ for the Exponential Distribution with Exponential Censoring
(cont’d).

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .70
   20    0.544   0.622   0.728   0.889   1.218   1.634
   40    0.717   0.824   0.969   1.206   1.685   2.316
   60    0.862   0.992   1.172   1.457   2.068   2.892
   80    0.986   1.135   1.345   1.681   2.407   3.398
  100    1.099   1.268   1.506   1.879   2.702   3.790
  120    1.204   1.387   1.652   2.078   2.988   4.240
  140    1.295   1.494   1.774   2.229   3.201   4.565
  160    1.386   1.600   1.908   2.389   3.441   4.890
  180    1.464   1.694   2.016   2.533   3.660   5.211
  200    1.539   1.780   2.118   2.664   3.856   5.462

q = .80
   20    0.393   0.463   0.557   0.703   0.994   1.351
   40    0.518   0.610   0.738   0.940   1.369   1.927
   60    0.644   0.758   0.926   1.195   1.763   2.546
   80    0.764   0.902   1.105   1.428   2.120   3.067
  100    0.876   1.039   1.272   1.652   2.466   3.563
  120    0.980   1.163   1.424   1.846   2.767   4.009
  140    1.077   1.280   1.569   2.049   3.061   4.424
  160    1.174   1.395   1.708   2.221   3.325   4.846
  180    1.266   1.502   1.840   2.387   3.577   5.182
  200    1.353   1.606   1.964   2.556   3.851   5.556

q = .90
   20    0.243   0.292   0.363   0.476   0.709   0.990
   40    0.241   0.294   0.370   0.499   0.782   1.171
   60    0.285   0.347   0.442   0.602   0.969   1.464
   80    0.337   0.412   0.527   0.724   1.177   1.795
  100    0.392   0.480   0.615   0.848   1.371   2.097
  120    0.446   0.547   0.702   0.964   1.571   2.386
  140    0.498   0.613   0.788   1.085   1.780   2.713
  160    0.551   0.677   0.870   1.200   1.968   2.995
  180    0.602   0.742   0.956   1.319   2.161   3.321
  200    0.656   0.808   1.036   1.432   2.331   3.573


Appendix E. Percentage Points for New Tests for the Weibull Distribution


The following tables of percentage points are to be used with the new KME-modified Cramér-von
Mises and Anderson-Darling goodness-of-fit tests for the Weibull distribution with shape parameters
2 and 3.5 presented in Section 3.6.




Table 26  Percentage Points of $W^2_{r,n}$ for the Weibull Distribution with Shape $\beta = 2$ and
Exponential Censoring.

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .10
   20    0.116   0.128   0.143   0.165   0.201   0.237
   40    0.115   0.127   0.143   0.164   0.202   0.238
   60    0.114   0.126   0.141   0.163   0.200   0.238
   80    0.113   0.125   0.140   0.162   0.199   0.236
  100    0.112   0.124   0.139   0.160   0.197   0.235

q = .20
   20    0.121   0.133   0.149   0.171   0.208   0.245
   40    0.119   0.131   0.147   0.168   0.206   0.243
   60    0.117   0.129   0.145   0.167   0.204   0.241
   80    0.116   0.128   0.143   0.165   0.202   0.239
  100    0.115   0.127   0.142   0.164   0.201   0.238

q = .30
   20    0.130   0.143   0.160   0.184   0.226   0.266
   40    0.129   0.142   0.159   0.182   0.222   0.262
   60    0.128   0.141   0.157   0.181   0.220   0.261
   80    0.127   0.140   0.156   0.180   0.219   0.260
  100    0.126   0.139   0.156   0.179   0.218   0.260

q = .40
   20    0.144   0.159   0.179   0.207   0.255   0.305
   40    0.146   0.161   0.180   0.207   0.253   0.300
   60    0.146   0.161   0.180   0.206   0.253   0.299
   80    0.146   0.160   0.179   0.205   0.250   0.296
  100    0.145   0.160   0.179   0.205   0.251   0.298

q = .50
   20    0.163   0.182   0.205   0.239   0.301   0.368
   40    0.172   0.190   0.213   0.246   0.304   0.365
   60    0.174   0.192   0.215   0.248   0.305   0.364
   80    0.175   0.192   0.215   0.248   0.305   0.361
  100    0.175   0.193   0.216   0.248   0.304   0.360

q = .60
   20    0.185   0.208   0.239   0.284   0.369   0.469
   40    0.209   0.232   0.263   0.307   0.389   0.478
   60    0.218   0.242   0.273   0.318   0.397   0.481
   80    0.223   0.247   0.277   0.322   0.399   0.481
  100    0.227   0.250   0.281   0.325   0.401   0.479

q = .70
   20    0.202   0.233   0.273   0.336   0.465   0.631
   40    0.254   0.288   0.334   0.403   0.539   0.705
   60    0.285   0.321   0.369   0.441   0.576   0.737
   80    0.305   0.341   0.390   0.461   0.596   0.751
  100    0.318   0.355   0.404   0.476   0.612   0.760

q = .80
   20    0.199   0.237   0.294   0.384   0.578   0.833
   40    0.281   0.332   0.403   0.517   0.770   1.104
   60    0.356   0.413   0.495   0.629   0.908   1.274
   80    0.413   0.478   0.567   0.710   1.002   1.384
  100    0.461   0.531   0.624   0.772   1.077   1.474

q = .90
   20    0.157   0.204   0.270   0.388   0.676   1.052
   40    0.173   0.233   0.322   0.475   0.858   1.440
   60    0.253   0.328   0.442   0.645   1.153   1.927
   80    0.350   0.444   0.587   0.839   1.450   2.374
  100    0.453   0.566   0.736   1.035   1.745   2.787


Table 27  Percentage Points of $W^2_{r,n}$ for the Weibull Distribution with Shape $\beta = 3.5$ and
Exponential Censoring.

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .10
   20    0.214   0.245   0.283   0.338   0.426   0.509
   40    0.199   0.230   0.271   0.331   0.434   0.535
   60    0.184   0.213   0.253   0.312   0.417   0.522
   80    0.171   0.197   0.233   0.289   0.391   0.500
  100    0.162   0.187   0.220   0.271   0.366   0.472

q = .20
   20    0.169   0.191   0.221   0.263   0.337   0.413
   40    0.152   0.171   0.197   0.234   0.302   0.377
   60    0.144   0.161   0.184   0.218   0.279   0.345
   80    0.139   0.155   0.176   0.207   0.263   0.323
  100    0.136   0.151   0.171   0.201   0.253   0.309

q = .30
   20    0.162   0.180   0.205   0.240   0.301   0.366
   40    0.151   0.168   0.190   0.220   0.276   0.333
   60    0.146   0.162   0.183   0.212   0.264   0.318
   80    0.144   0.159   0.179   0.208   0.258   0.309
  100    0.142   0.157   0.177   0.205   0.254   0.304

q = .40
   20    0.171   0.190   0.214   0.248   0.310   0.371
   40    0.165   0.183   0.205   0.237   0.293   0.350
   60    0.162   0.179   0.201   0.233   0.288   0.344
   80    0.161   0.177   0.199   0.230   0.284   0.339
  100    0.159   0.176   0.198   0.229   0.282   0.337

q = .50
   20    0.192   0.214   0.241   0.281   0.351   0.422
   40    0.191   0.212   0.238   0.275   0.339   0.404
   60    0.190   0.210   0.236   0.273   0.336   0.401
   80    0.189   0.209   0.234   0.270   0.334   0.397
  100    0.189   0.208   0.234   0.270   0.331   0.395

q = .60
   20    0.226   0.252   0.287   0.336   0.426   0.522
   40    0.233   0.259   0.291   0.339   0.420   0.503
   60    0.235   0.260   0.292   0.337   0.416   0.497
   80    0.235   0.260   0.291   0.336   0.414   0.494
  100    0.235   0.260   0.291   0.336   0.414   0.490

q = .70
   20    0.272   0.307   0.354   0.424   0.556   0.714
   40    0.299   0.334   0.379   0.445   0.566   0.701
   60    0.310   0.344   0.389   0.453   0.568   0.690
   80    0.315   0.349   0.394   0.457   0.568   0.683
  100    0.318   0.352   0.396   0.458   0.562   0.678

q = .80
   20    0.334   0.383   0.451   0.558   0.773   1.036
   40    0.401   0.456   0.529   0.641   0.865   1.142
   60    0.445   0.501   0.576   0.688   0.900   1.154
   80    0.468   0.524   0.599   0.708   0.916   1.157
  100    0.485   0.540   0.615   0.723   0.922   1.146

q = .90
   20    0.421   0.497   0.608   0.786   1.127   1.506
   40    0.523   0.623   0.767   1.005   1.528   2.206
   60    0.634   0.748   0.914   1.177   1.786   2.582
   80    0.737   0.864   1.044   1.329   1.976   2.839
  100    0.824   0.956   1.145   1.445   2.094   2.940


Table 28  Percentage Points of $A^2_{r,n}$ for the Weibull Distribution with Shape $\beta = 2$ and
Exponential Censoring.

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .10
   20    0.799   0.866   0.949   1.062   1.255   1.442
   40    0.793   0.861   0.949   1.069   1.267   1.464
   60    0.781   0.851   0.939   1.061   1.261   1.466
   80    0.774   0.838   0.926   1.049   1.255   1.452
  100    0.761   0.831   0.919   1.039   1.242   1.441

q = .20
   20    0.807   0.877   0.967   1.091   1.301   1.504
   40    0.794   0.865   0.954   1.076   1.286   1.496
   60    0.782   0.852   0.941   1.064   1.270   1.476
   80    0.772   0.841   0.928   1.048   1.253   1.457
  100    0.765   0.832   0.918   1.038   1.243   1.443

q = .30
   20    0.842   0.918   1.019   1.159   1.404   1.656
   40    0.841   0.918   1.016   1.154   1.388   1.626
   60    0.837   0.912   1.009   1.142   1.374   1.612
   80    0.834   0.909   1.001   1.134   1.362   1.591
  100    0.830   0.905   0.997   1.126   1.355   1.583

q = .40
   20    0.903   0.991   1.107   1.273   1.582   1.920
   40    0.931   1.021   1.135   1.298   1.588   1.908
   60    0.943   1.032   1.146   1.307   1.594   1.888
   80    0.948   1.036   1.148   1.306   1.584   1.873
  100    0.952   1.039   1.151   1.308   1.584   1.872

q = .50
   20    0.983   1.089   1.233   1.443   1.856   2.356
   40    1.063   1.174   1.318   1.532   1.930   2.391
   60    1.104   1.216   1.360   1.573   1.969   2.413
   80    1.123   1.233   1.380   1.593   1.979   2.407
  100    1.144   1.254   1.399   1.608   1.985   2.395

q = .60
   20    1.071   1.203   1.383   1.661   2.238   2.989
   40    1.241   1.386   1.579   1.875   2.478   3.294
   60    1.335   1.486   1.689   1.995   2.601   3.382
   80    1.394   1.551   1.757   2.068   2.670   3.400
  100    1.441   1.599   1.810   2.124   2.725   3.429

q = .70
   20    1.116   1.282   1.512   1.885   2.726   3.907
   40    1.429   1.629   1.912   2.364   3.379   4.838
   60    1.639   1.863   2.176   2.677   3.781   5.296
   80    1.791   2.026   2.346   2.870   4.008   5.572
  100    1.903   2.144   2.487   3.035   4.189   5.733

q = .80
   20    1.060   1.256   1.538   2.015   3.142   4.706
   40    1.473   1.742   2.127   2.797   4.456   6.825
   60    1.868   2.189   2.668   3.492   5.483   8.400
   80    2.192   2.558   3.102   4.032   6.271   9.589
  100    2.470   2.875   3.458   4.478   6.947  10.508

q = .90
   20    0.848   1.078   1.383   1.933   3.325   5.294
   40    0.969   1.235   1.636   2.355   4.289   7.503
   60    1.343   1.685   2.200   3.163   5.788  10.352
   80    1.799   2.223   2.885   4.094   7.403  13.014
  100    2.280   2.787   3.594   5.072   9.019  15.535


Table 29  Percentage Points of $A^2_{r,n}$ for the Weibull Distribution with Shape $\beta = 3.5$ and
Exponential Censoring.

                         α
   n     0.25    0.20    0.15    0.10    0.05    0.025

q = .10
   20    1.956   2.214   2.523   2.931   3.550   4.095
   40    1.648   1.920   2.275   2.772   3.579   4.328
   60    1.414   1.661   2.007   2.507   3.368   4.210
   80    1.258   1.473   1.775   2.231   3.074   3.932
  100    1.163   1.349   1.612   2.028   2.809   3.643

q = .20
   20    1.343   1.542   1.804   2.182   2.808   3.399
   40    1.079   1.222   1.418   1.719   2.288   2.910
   60    0.982   1.101   1.260   1.503   1.967   2.487
   80    0.930   1.035   1.176   1.383   1.775   2.213
  100    0.901   0.998   1.128   1.318   1.664   2.046

q = .30
   20    1.146   1.288   1.483   1.773   2.297   2.842
   40    1.006   1.116   1.262   1.478   1.862   2.285
   60    0.961   1.060   1.191   1.377   1.714   2.063
   80    0.939   1.031   1.151   1.324   1.633   1.951
  100    0.925   1.015   1.133   1.298   1.584   1.893

q = .40
   20    1.128   1.258  [illegible]  1.682   2.140   2.627
   40    1.065   1.174   1.316   1.521   1.880   2.262
   60    1.044   1.146   1.279   1.473   1.801   2.150
   80    1.034   1.134   1.263   1.447   1.761   2.091
  100    1.029   1.128   1.253   1.431   1.740   2.060

q = .50
   20    1.209   1.343   1.524   1.793   2.279   2.836
   40    1.205   1.330   1.489   1.721   2.137   2.583
   60    1.204   1.327   1.482   1.703   2.098   2.511
   80    1.205   1.322   1.474   1.694   2.074   2.463
  100    1.210   1.327   1.474   1.688   2.055   2.434

q = .60
   20    1.356   1.515   1.731   2.053   2.674   3.437
   40    1.431   1.584   1.788   2.083   2.637   3.273
   60    1.460   1.615   1.811   2.098   2.612   3.183
   80    1.478   1.628   1.825   2.104   2.610   3.148
  100    1.495   1.643   1.836   2.114   2.598   3.122

q = .70
   20    1.566   1.770   2.050   2.482   3.373   4.601
   40    1.764   1.972   2.253   2.691   3.570   4.710
   60    1.871   2.087   2.371   2.791   3.616   4.617
   80    1.934   2.147   2.430   2.846   3.635   4.572
  100    1.979   2.188   2.470   2.880   3.651   4.519

q = .80
   20    1.833   2.102   2.485   3.106   4.469   6.233
   40    2.248   2.567   3.009   3.737   5.339   7.622
   60    2.545   2.887   3.363   4.124   5.769   8.046
   80    2.723   3.073   3.562   4.331   5.972   8.179
  100    2.871   3.226   3.719   4.490   6.076   8.210

q = .90
   20    2.207   2.590   3.161   4.095   6.109   8.596
   40    2.753   3.263   4.030   5.363   8.567  12.979
   60    3.333   3.951   4.877   6.490  10.494  16.435
   80    3.893   4.593   5.642   7.513  12.110  18.985
  100    4.382   5.155   6.2947  8.308  13.222  20.517


Appendix  F.  Plots  and  Tables  of  Power  Study  Results  for  Tests  of  Exponentiality 


Table 30  Empirical Power of Goodness-of-Fit Statistics at α = 0.10 and q = 0.20.


Exponential  Distribution  with  Exponential  Censoring 


Alternative  Distribution 

Exponential 

Weibull 

shape=2 

Gamma 

shape=1.5 

Gamma 

shape=:2 

Lognormal 
from  N(0,1) 

17^20 

0.107 

WEM 

BBB 

■BB 

BEH 

■^r.n 

0.108 

BBI 

Bml 

mSm 

H^B 

0.108 

WmM 

B^l 

■H 

0.061 

WEm 

Bifl 

0.162 

0.042 

STCL-CvM 

0.078 

0.802 

BiB 

■SB 

0.155 

STCL-AD 

0.091 

HQoyjl 

0.188 

■■ 

0.157 

ir^4o 

0.084 

BQigBI 

0.832 

42 

0.087 

0.828 

52 

0.093 

0.644 

n(l-i^) 

0.062 

0.877 

0.411 

STCL-CvM 

0.076 

0.994 

^91 

0.736 

STCL-AD 

0.085 

0.993 

0.337 

0.743 

0.293 

0.552 

0.944 

0.549 

0.952 

HjRiS 

0.391 

0.878 

0.363 

n(l  -  Bi) 

»»!« 

0.266 

0.640 

0.104 

STCL-CvM 

0.104 

BniH 

0.458 

0.901 

0.335 

STCL-AD 

0.096 

1.000 

0.457 

0.907 

0.422 

IT^so 

0.100 

0.700 

■SB 

0.094 

0.704 

Bifl 

B2 

0.088 

0.576 

w^M 

0.510 

n(l  -  Rl) 

0.087 

0.369 

0.771 

0.148 

STCL-CvM 

0.099 

B99 

0.599 

0.975 

0.457 

STCL-AD 

0.117 

1.000 

0.600 

0.988 

0.587 

n  =  100 

jjnQ^II 

0.796 

^r’n 

0.809 

B^ 

0.708 

0.661 

nil-Rl) 

1.000 

0.432 

0.157 

STCL-CvM 

1.000 

0.705 

0.569 

STCL-AD 

1.000 

0.722 

9nBW 

0.731 



Table 31  Empirical Power of Goodness-of-Fit Statistics at α = 0.10 and q = 0.50.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

n = 20

0.544 

■Bi 

■HH 

0.512 

0.132 

■H 

hibh 

0.280 

0.103 

0.044 

STCL-CvM 

0.530 

0.174 

0.284 

0.191 

STCL-AD 

0.096 

0.467 

0.141 

0.240 

0.157 

0.888 

0.876 

HiB 

0.853 

0.169 

0.464 

■Ih 

STCL-CvM 

0.076 

0.860 

0.240 

0.518 

■■■ 

STCL-AD 

0.089 

0.833 

0.221 

0.490 

0.329 

0.974 

■BSBI 

MMM 

0.264 

^r,n 

0.973 

0.258 

B^ 

0.982 

mBM 

0.765 

0.411 

STCL-CvM 

0.960 

0.306 

0.703 

0.434 

STCL-AD 

0.094 

0.957 

0.293 

0.699 

0.468 

0.109 

■QgggHIII 

0.837 

0.098 

0.836 

BIH 

# 

0.094 

0.914 

STCL-CvM 

0.081 

0.822 

0.586 

STCL-AD 

0.080 

0.994 

0.390 

0.836 

0.659 

n = 100

1.000 

0.492 

■Bi 

0.424 

1.000 

0.463 

0.448 

B^ 

1.000 

0.605 

0.780 

STCL-CvM 

1.000 

0.497 

0.701 

STCL-AD 

0.092 

1.000 

0.500 

0.779 



Table 32  Empirical Power of Goodness-of-Fit Statistics at α = 0.10 and q = 0.80.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

n  =  20 

r,n 

WESSBM 

0.227 

0.113 

“^r.n 

mBSM 

0.127 

0.010 

0.002 

STCL-CvM 

BBH 

0.168 

0.176 

STCL-AD 

0.080 

0.132 

0.137 

n  =  40 

BSH 

0.205 

0.122 

An 

HSH 

0.245 

0.154 

■Bl 

0.021 

0.014 

STCL-CvM 

■BH 

0.336 

0.155 

0.245 

0.253 

STCL-AD 

0.275 

0.132 

0.202 

0.204 

n  =  60 

TjU 

jjlllQg^B 

0.697 

0.264 

0.690 

0.258 

0.185 

0.098 

0.108 

STCL-CvM 

0.492 

0.313 

0.364 

STCL-AD 

0.442 

0.149 

0.276 

0.338 

n  =  80 

0.622 

IjBSin 

0.179 

0.698 

^BETh 

0.262 

B^ 

0.381 

0.097 

0.262 

STCL-CvM 

0.646 

0.201 

0.424 

0.475 

STCL-AD 

IIIBEES^I 

0.600 

0.187 

0.406 

0.448 

n  =  100 

[||||||!|R^|H 

0.195 

MSBM 

0.199 

0.214 

BB 

0.269 

B^ 

0.099 

0.160 

WSM 

0.505 

STCL-CvM 

0.108 

0.745 

0.224 

0.504 

0.604 

STCL-AD 

0.097 

0.720 

0.208 

0.493 

0.592 



Table 33  Empirical Power of Goodness-of-Fit Statistics at α = 0.05 and q = 0.20.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

IIQgSIB 

42 

0.049 

BBStB 

0.070 

n(l  -  Rl) 

0.027 

0.183 

0.057 

0.017 

STCL-CvM 

0.034 

0.680 

0.272 

0.085 

STCL-AD 

0.043 

0.629 

0.231 

0.091 

n = 40

BSEH 

0.990 

0.305 

0.727 

0.171 

BESH 

0.987 

0.265 

0.695 

0.187 

BSH 

0.959 

0.127 

0.473 

0.113 

n{l  -  Rl) 

■eh 

0.730 

0.087 

0.244 

STCL-CvM 

0.031 

0.977 

0.219 

0.604 

0.148 

STCL-AD 

0.033 

0.976 

0.213 

0.599 

0.177 

n  =  60 

0.409 

0.885 

“^r,n 

0.387 

0.892 

0.044 

0.247 

0.780 

0.213 

n(l  -  Rl) 

0.028 

0.144 

0.447 

0.054 

STCL-CvM 

0.057 

0.308 

0.798 

0.218 

STCL-AD 

0.055 

0.309 

0.831 

0.271 

n  =  80 

0.564 

0.965 

Ain 

BjR^B 

0.566 

0.978 

■HH 

b’^ 

0.040 

0.414 

0.939 

0.313 

n{l  -  Rl) 

0.036 

0.223 

0.610 

STCL-CvM 

0.063 

BE  SI 

0.442 

0.935 

STCL-AD 

0.063 

1.000 

0.473 

0.951 

0.417 

n  =  100 

BRiin 

MSSM 

0.455 

42’ 

BjRSB 

BIB 

WmM 

0.576 

£2 

BjBifl 

BB 

0.442 

n(l-J^) 

BjR^I 

0.281 

■Sh 

STCL-CvM 

0.042 

BjR^B 

0.568 

0.973 

STCL-AD 

0.042 

BtiiiiB 

0.595 

0.987 



Table 34  Empirical Power of Goodness-of-Fit Statistics at α = 0.05 and q = 0.50.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

II 

to 

o 

Wn 

HSH 

0.382 

0.078 

■iiyH 

■EH 

0.306 

0.059 

■wHH 

0.089 

0.029 

STCL-CvM 

■EH 

0.374 

0.176 

0.106 

STCL-AD 

0.294 

0.140 

0.102 

o 

II 

' ''  r.n 

miaxgM 

0.344 

■IQrenillii 

^nn^H 

0.272 

0.081 

0.293 

STCL-CvM 

0.140 

0.357 

STCL-AD 

0.702 

0.120 

0.334 

0.180 

n  =  60 

Wn 

■EH 

0.178 

0.534 

0.127 

0.461 

HH 

B^ 

0.052 

■■ 

0.165 

0.598 

■mB 

STCL-CvM 

0.041 

0.917 

0.190 

0.559 

BlH 

STCL-AD 

0.048 

0.908 

0.546 

0.308 

n  =  80 

Tytl 

■EH 

0.244 

0.721 

•^ryU 

0.189 

0.652 

B^ 

0.046 

■■ 

0.319 

0.832 

0.427 

STCL-CvM 

0.042 

0.985 

0.269 

0.714 

0.427 

STCL-AD 

0.046 

0.985 

0.257 

0.720 

0.487 

o 

o 

II 

VP' 

'  ^  r.n 

1.000 

0.335 

i[img[Qiii 

42 

■^r,n 

0.998 

0.256 

B^ 

1.000 

0.443 

0.631 

STCL-CvM 

0.997 

0.364 

0.839 

0.557 

STCL-AD 

0.045 

0.998 

0.362 

0.862 

0.625 



Table 35  Empirical Power of Goodness-of-Fit Statistics at α = 0.05 and q = 0.80.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

o 

CM 

II 

Wn 

0.063 

0.110 

^eseh 

0.063 

masm 

0.045 

0.000 

STCL-CvM 

0.045 

Hull 

0.080 

0.075 

STCL-AD 

0.056 

0.040 

0.051 

0.055 

n = 40

r.n 

0.050 

^ess^h 

■^TyTl 

0.048 

0.046 

0.005 

0.000 

STCL-CvM 

0.039 

0.211 

0.078 

0.134 

0.135 

STCL-AD 

0.049 

0.166 

0.051 

0.099 

0.100 

n  =  60 

Tytl 

0.323 

ee^h 

eshei 

MSMM 

■^r,n 

0.400 

EiwcM 

BSE 

B^ 

0.028 

0.006 

0.016 

BE 

STCL-CvM 

0.041 

0.335 

0.197 

mSSSM 

STCL-AD 

0.044 

0.283 

EEu^l 

0.162 

n  =  80 

r,n 

0.044 

0.424 

BSSI 

0.079 

42 

•^r.n 

0.506 

■HI 

0.108 

B^ 

0.110 

HH 

0.062 

STCL-CvM 

0.053 

0.486 

0.122 

0.332 

STCL-AD 

0.052 

0.439 

■qEBI 

0.256 

0.306 

o 
o 
1— 1 

II 

^ '  r.n 

HBSI 

0.240 

eessbbi 

E9 

0.290 

B^ 

0.052 

mSm 

0.038 

0.163 

0.182 

STCL-CvM 

0.059 

0.131 

0.372 

0.436 

STCL-AD 

0.052 

HHI 

0.110 

0.338 



Table 36  Empirical Power of Goodness-of-Fit Statistics at α = 0.025 and q = 0.20.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

n  =  20 

r,n 

WSM 

0.268 

Hsm 

42 

■^r,n 

Bl 

0.172 

mSm 

0.007 

0.035 

0.032 

STCL-CvM 

0.013 

0.140 

0.038 

STCL-AD 

0.021 

0.434 

0.112 

0.038 

o 

II 

r.n 

■SB 

MMM 

MSSSM 

BH 

WBm 

BiSB 

wgm 

■■■ 

0.321 

BSIB 

STCL-CvM 

BH 

0.114 

0.429 

STCL-AD 

0.010 

■Hi 

0.098 

0.420 

— 

n  =  60 

Wn 

|[|[|||QR&|||||[| 

0.812 

0.276 

BiB 

B^ 

0.024 

■■■ 

STCL-CvM 

0.020 

0.191 

0.111 

STCL-AD 

0.020 

0.180 

0.685 

0.140 

n  =  80 

■SB 

■■■ 

42 

•^r^n 

HBiH 

■IH 

B^ 

BjRSH 

0.294 

0.867 

■■■ 

STCL-CvM 

0.023 

BBiH 

0.292 

0.851 

0.146 

STCL-AD 

0.025 

0.304 

0.873 

0.218 

o 

o 

T— 1 

II 

r^n 

1.000 

■■■ 

0.984 

■SBI 

42 

1.000 

BIB 

B^ 

1.000 

hb 

STCL-CvM 

0.018 

1.000 

■Hfl 

0.930 

STCL-AD 

0.022 

1.000 

0.429 

0.953 



Table 37  Empirical Power of Goodness-of-Fit Statistics at α = 0.025 and q = 0.50.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

n  =  20 

r,n 

0.030 

0.040 

■^r.n 

0.030 

BUS 

0.016 

0.021 

0.000 

BitiiiM 

STCL-CvM 

0.016 

0.184 

0.036 

0.079 

STCL-AD 

0.026 

0.134 

0.022 

0.052 

0.039 

II 

o 

' '  r,n 

0.660 

0.060 

WSSSM 

|||_— — 1_ 

-^r,n 

0.493 

0.027 

mSm 

0.028 

0.535 

0.030 

BiB 

0.029 

STCL-CvM 

0.019 

0.572 

0.067 

WS^ 

0.095 

STCL-AD 

0.021 

0.536 

0.043 

0.196 

0.079 

II 

o 

' '  r,n 

0.033 

WBM 

MBM 

•^r,n 

0.028 

■il 

wBm 

B^ 

0.025 

WBM 

■BH 

STCL-CvM 

0.019 

0.828 

0.098 

0.366 

0.162 

STCL-AD 

0.018 

0.812 

0.360 

0.159 

o 

00 

II 

r,n 

■ssai 

0.983 

0.153 

0.553 

■BH 

42 

0.947 

0.382 

B^ 

■sH 

0.981 

0.191 

0.702 

STCL-CvM 

WmM 

0.961 

0.156 

0.543 

STCL-AD 

0.959 

0.151 

0.539 

II 

o 

o 

r,n 

0.997 

n 

HSiH 

42 

^r,n 

■iH 

B^ 

0.023 

0.998 

0.298 

WBm 

STCL-CvM 

0.017 

0.987 

0.213 

mmu 

STCL-AD 

0.015 

0.991 

0.200 

0.742 

0.437 



Table 38  Empirical Power of Goodness-of-Fit Statistics at α = 0.025 and q = 0.80.
Exponential Distribution with Exponential Censoring

Alternative Distribution
Exponential    Weibull shape=2    Gamma shape=1.5    Gamma shape=2    Lognormal from N(0,1)

n  =  20 

Wn 

lIRMn 

WSSM 

^r.n 

■Ba 

BffiliB 

0.020 

0.007 

0.000 

STCL-CvM 

0.015 

0.014 

0.022 

0.019 

STCL-AD 

0.015 

B 

0.012 

0.012 

0.011 

o 

II 

r,n 

0.028 

0.034 

^B^B 

0.028 

0.041 

^BBi^b 

0.022 

WtiiiiM 

0.001 

0.000 

0.000 

STCL-CvM 

0.012 

0.102 

0.028 

0.048 

0.047 

STCL-AD 

0.015 

0.060 

0.018 

0.034 

0.028

n  =  60 

Wn 

■SSI 

0.026 

^r.n 

mSm 

0.031 

BH 

0.001 

STCL-CvM 

0.034 

0.104 

0.113 

STCL-AD 

BBS 

■la 

0.025 

0.072 

0.087 

1! 

00 

o 

'''  r,n 

0.245 

•^rji 

0.024 

0.284 

0.018 

0.000 

0.008 

^BqS^B 

STCL-CvM 

0.021 

0.318 

0.055 

0.156 

0.174 

STCL-AD 

0.020 

0.259 

0.047 

0.120 

0.149 

II 

o 

o 

r,n 

0.359 

0.029 

0.126 

■m^ia 

0.410 

0.030 

0.145 

52 

0.085 

0.002 

mSm 

STCL-CvM 

0.024 

0.418 

0.059 

■a 

^bii^b 

STCL-AD 

0.024 

0.373 

0.041 

0.196 

^beisi 



[Plot: empirical power versus sample size. Panel: Exponential Alternative with Exponential Censoring, Alpha=0.10, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 43  Power Comparison of Tests for Exponentiality, Underlying Distribution is Exponential, α = 0.10.



[Plot: empirical power study. Panel: Exponential Alternative with Exponential Censoring, Alpha=0.05, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 44  Power Comparison of Tests for Exponentiality, Underlying Distribution is Exponential, α = 0.05.



[Plot: empirical power study. Panel: Exponential Alternative with Exponential Censoring, Alpha=0.025, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 45  Power Comparison of Tests for Exponentiality, Underlying Distribution is Exponential, α = 0.025.



[Plot: empirical power study. Panel: Weibull (shape 2) Alternative with Exponential Censoring, Alpha=0.10, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 46  Power Comparison of Tests for Exponentiality, Underlying Distribution is Weibull (shape=2), α = 0.10.



[Plot: empirical power study. Panel: Weibull (shape 2) Alternative with Exponential Censoring, Alpha=0.05, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 47  Power Comparison of Tests for Exponentiality, Underlying Distribution is Weibull (shape=2), α = 0.05.



[Plot: empirical power study. Panel: Weibull (shape 2) Alternative with Exponential Censoring, Alpha=0.025, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 48  Power Comparison of Tests for Exponentiality, Underlying Distribution is Weibull (shape=2), α = 0.025.



[Plot: empirical power study. Panel: Gamma (shape 1.5) Alternative with Exponential Censoring, Alpha=0.10, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 49  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=1.5), α = 0.10.



[Plot: empirical power study. Panel: Gamma (shape 1.5) Alternative with Exponential Censoring, Alpha=0.05, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 50  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=1.5), α = 0.05.



[Plot: empirical power study. Panel: Gamma (shape 1.5) Alternative with Exponential Censoring, Alpha=0.025, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 51  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=1.5), α = 0.025.



[Plot: empirical power study. Panel: Gamma (shape 2) Alternative with Exponential Censoring, Alpha=0.10, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 52  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=2), α = 0.10.



[Plot: empirical power study. Panel: Gamma (shape 2) Alternative with Exponential Censoring, Alpha=0.05, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 53  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=2), α = 0.05.



[Plot: empirical power study. Panel: Gamma (shape 2) Alternative with Exponential Censoring, Alpha=0.025, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 54  Power Comparison of Tests for Exponentiality, Underlying Distribution is Gamma (shape=2), α = 0.025.



[Plot: empirical power versus sample size (20 to 100). Panel: Lognormal Alternative with Exponential Censoring, Alpha=0.10, q=0.50. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 55  Power Comparison of Tests for Exponentiality, Underlying Distribution is Lognormal from N(0,1), α = 0.10.



[Plot: empirical power study. Panel: Lognormal Alternative with Exponential Censoring, Alpha=0.05, q=0.20. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]

Figure 56  Power Comparison of Tests for Exponentiality, Underlying Distribution is Lognormal from N(0,1), α = 0.05.



[Plot: empirical power versus sample size (20 to 100). Panels: Lognormal Alternative with Exponential Censoring, Alpha=0.025, q=0.20; Lognormal Alternative with Exponential Censoring, Alpha=0.025, q=0.50. Legend: plotting symbols for the test statistics, including STCL-CvM and STCL-AD.]


Appendix G.  Matlab Code for 2-Parameter Weibull MLE

% mle.m
% Maximum Likelihood Estimation for 2-parameter Weibull
% using iterative Newton-Raphson approach given in the Leemis textbook
% "SORTDATA" is a complete set of failure and withdrawal times
% Written by Dave Reineke, 1997

clear NREM LX LY Lx Bml INITB fail
n=length(SORTDATA);
k=0;
for j=1:n
  if SORTDATA(j,2)==1
    k=k+1;
    fail(k)=SORTDATA(j,1);
  end;
end;

if k >= 2
  k=0;
  for j=1:n
    if SORTDATA(j,2)==1
      k=k+1;
      NREM(k)=n-j+1;
    end;
  end;
  LX=log(fail);
  LY=log(log((n+1)./NREM));
  LXbar=mean(LX);
  LYbar=mean(LY);
  Nsum=sum((LX-LXbar).*(LY-LYbar));
  Dsum=sum((LX-LXbar).^2);
  INITB=Nsum/Dsum;
  Lx=log(SORTDATA(:,1));
  Lsum=sum((SORTDATA(:,1).^INITB).*Lx);
  Psum=sum(SORTDATA(:,1).^INITB);
  LPsum=sum((Lx.^2).*SORTDATA(:,1).^INITB);
  Bml=INITB-(k/INITB+sum(LX)-k*Lsum/Psum)/(-k/INITB^2-k/Psum^2*(Psum*LPsum-Lsum^2));
  DIFF=1;




  C=1;
  while DIFF > .00001
    C=C+1;
    if C == 1001
      C
      C=C-1;   % no convergence: fall back to the last computed iterate, Bml(1000)
      break;
    end;
    Lsum=sum((SORTDATA(:,1).^Bml(C-1)).*Lx);
    Psum=sum(SORTDATA(:,1).^Bml(C-1));
    LPsum=sum((Lx.^2).*SORTDATA(:,1).^Bml(C-1));
    Bml(C)=Bml(C-1)-(k/Bml(C-1)+sum(LX)-k*Lsum/Psum)/(-k/Bml(C-1)^2-k/Psum^2*(Psum*LPsum-Lsum^2));
    DIFF=abs(Bml(C) - Bml(C-1));
  end;
end;

BETAhat=Bml(C);
ETAhat=(sum(SORTDATA(:,1).^BETAhat)/k).^(1/BETAhat);
MUhat=ETAhat*gamma(1/BETAhat + 1);
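
The Newton-Raphson iteration in mle.m can be cross-checked outside Matlab. The Python sketch below is not part of the original listing. It assumes right-censored data supplied as two lists (observed times and 0/1 failure flags), starts from a fixed guess beta0 instead of the regression-based starting value INITB, and halves the shape whenever a Newton step would make it non-positive. The names weibull_score and weibull_mle and the sample data are hypothetical.

```python
import math


def weibull_score(beta, times, failed):
    """Return the profile score for the Weibull shape and its derivative."""
    k = sum(failed)                                  # number of failures
    logs = [math.log(t) for t in times]
    powers = [t ** beta for t in times]
    p = sum(powers)                                  # Psum in mle.m
    l = sum(w * lx for w, lx in zip(powers, logs))   # Lsum in mle.m
    lp = sum(w * lx * lx for w, lx in zip(powers, logs))  # LPsum in mle.m
    sum_log_fail = sum(lx for lx, d in zip(logs, failed) if d == 1)
    score = k / beta + sum_log_fail - k * l / p
    slope = -k / beta ** 2 - k * (p * lp - l * l) / p ** 2
    return score, slope


def weibull_mle(times, failed, beta0=1.0, tol=1e-5, max_iter=1000):
    """Newton-Raphson estimate of (shape, scale) from right-censored data."""
    if sum(failed) < 2:
        raise ValueError("at least two failures are required")
    beta = beta0
    for _ in range(max_iter):
        score, slope = weibull_score(beta, times, failed)
        new_beta = beta - score / slope
        if new_beta <= 0.0:          # safeguard, an assumption not in mle.m
            new_beta = beta / 2.0
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    eta = (sum(t ** beta for t in times) / sum(failed)) ** (1.0 / beta)
    return beta, eta


# Hypothetical sample: ten observed times, three of them withdrawals.
times = [5, 12, 18, 25, 31, 40, 47, 55, 63, 80]
failed = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
beta_hat, eta_hat = weibull_mle(times, failed)
print(beta_hat, eta_hat)
```

At convergence the score returned by weibull_score should be numerically zero, which gives a quick check on any shape estimate produced by the Matlab routine for the same data.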


Appendix H.  Matlab Code for Minimum Distance Estimation of a Weibull Location Parameter

% mde.m  Finds the minimum distance estimate (MDE) of the location
% parameter of a 3 parameter Weibull distribution.
% Written by Dave Reineke, 1998
% SORTDATA is a n by 2 matrix of failure and withdrawal times ( - loc).
% FAIL is the set of failure times only.
% KME is the Kaplan-Meier estimator of the distribution function.

t=1-exp(-((FAIL-GAMhat1)./ETAhat1).^BETAhat1);

% MDE using the Cramer-von Mises statistic (KME vs MLE)
diff=1;
% Golden Search algorithm
alf=2/(1+sqrt(5));
lt=0;
rt=min(data(:,1));
c1=0;
while diff > .00000001
  c1=c1+1;
  if c1 == 1000
    c1
    GAMhatW=min(data(:,1));
    break;
  end;
  x1=lt + (1-alf)*(rt-lt);
  x2=lt + alf*(rt-lt);
  t=1-exp(-((FAIL-x1)./ETAhat1).^BETAhat1);
  W1md=0;
  for i=1:r-1
    W1md=W1md + (KME(i)^2)*(t(i+1)-t(i)) - KME(i)*(t(i+1)^2-t(i)^2);
  end;
  W2md1=n/3 + n*(W1md + t(r)^2-t(r));
  t=1-exp(-((FAIL-x2)./ETAhat1).^BETAhat1);
  w2md=0;
  for i=1:r-1
    w2md=w2md + (KME(i)^2)*(t(i+1)-t(i)) - KME(i)*(t(i+1)^2-t(i)^2);
  end;
  W2md2=n/3 + n*(w2md + t(r)^2-t(r));
  if W2md1 < W2md2
    rt=x2;




    GAMhatW=x1;
  else
    lt=x1;
    GAMhatW=x2;
  end;
  diff=abs(W2md1-W2md2);
end;

% MDE using the Anderson-Darling statistic (KME vs MLE)
diff=1;
% Golden Search algorithm
alf=2/(1+sqrt(5));
lt=0;
rt=.999*min(data(:,1));
c2=0;
while diff > .00000001
  c2=c2+1;
  if c2 == 1000
    c2
    GAMhatA=min(data(:,1));
    break;
  end;
  x1=lt + (1-alf)*(rt-lt);
  x2=lt + alf*(rt-lt);
  t=1-exp(-((FAIL-x1)./ETAhat1).^BETAhat1);
  [tmin1,I1]=min(t);
  [tmax1,I3]=max(t);
  if tmin1==0
    if tmax1 == 1
      if I3-1-I1==1
        A2md1=-n+n*(-log(t(I1+1)) - log(1-t(I1+1)));
      elseif I3-1-I1>1
        A1md=-(KME(I1+1)^2)*log(t(I1+1))-(1-(KME(I1+1)-1)^2)*log(1-t(I1+1));
        for i=I1+1:I3-1
          A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
        end;
        A2md1=-n+n*A1md;




      end;
    elseif tmax1 < 1
      if r-I1==1
        A2md1=-n+n*(-log(t(r)) - log(1-t(r)));
      elseif r-I1>1
        A1md=-(KME(I1+1)^2)*log(t(I1+1))-(1-(KME(I1+1)-1)^2)*log(1-t(I1+1));
        for i=I1+2:r
          A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
        end;
        A2md1=-n+n*A1md;
      end;
    end;
  elseif tmin1 > 0
    if tmax1==1
      if I3==2
        A2md1=-n+n*(-log(t(1)) - log(1-t(1)));
      elseif I3 > 2
        A1md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
        for i=2:I3-1
          A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
        end;
        A2md1=-n+n*A1md;
      end;
    elseif tmax1 < 1
      A1md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
      for i=2:r
        A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
      end;
      A2md1=-n+n*A1md;
    end;
  end;

  t=1-exp(-((FAIL-x2)./ETAhat1).^BETAhat1);
  [tmin2,I2]=min(t);
  [tmax2,I4]=max(t);
  if tmin2==0
    if tmax2 == 1
      if I4-1-I2==1




        A2md2=-n+n*(-log(t(I2+1)) - log(1-t(I2+1)));
      elseif I4-1-I2>1
        A2md=-(KME(I2+1)^2)*log(t(I2+1))-(1-(KME(I2+1)-1)^2)*log(1-t(I2+1));
        for i=I2+1:I4-1
          A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
        end;
        A2md2=-n+n*A2md;
      end;
    elseif tmax2 < 1
      if r-I2==1
        A2md2=-n+n*(-log(t(r)) - log(1-t(r)));
      elseif r-I2>1
        A2md=-(KME(I2+1)^2)*log(t(I2+1))-(1-(KME(I2+1)-1)^2)*log(1-t(I2+1));
        for i=I2+2:r
          A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
        end;
        A2md2=-n+n*A2md;
      end;
    end;
  elseif tmin2 > 0
    if tmax2==1
      if I4==2
        A2md2=-n+n*(-log(t(1)) - log(1-t(1)));
      elseif I4 > 2
        A2md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
        for i=2:I4-1
          A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
        end;
        A2md2=-n+n*A2md;
      end;
    elseif tmax2 < 1
      A2md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
      for i=2:r
        A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
      end;
      A2md2=-n+n*A2md;
    end;




  end;

  if A2md1 < A2md2
    rt=x2;
    GAMhatA=x1;
  else
    lt=x1;
    GAMhatA=x2;
  end;
  diff=abs(A2md1-A2md2);
end;
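
The bracketing logic shared by the two searches in mde.m can be isolated in a few lines. The Python sketch below is an illustration, not the author's code. It uses the same constant 2/(1+sqrt(5)) and the same interior points x1 and x2, but as a deliberate substitution it stops when the bracket width falls below tol, whereas the listing stops when the two objective values differ by less than 1e-8. The function name golden_section_min and the quadratic test objective are hypothetical.

```python
import math


def golden_section_min(f, lo, hi, tol=1e-8, max_iter=1000):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    ratio = 2.0 / (1.0 + math.sqrt(5.0))   # about 0.618, `alf` in mde.m
    best = lo
    for _ in range(max_iter):
        x1 = lo + (1.0 - ratio) * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        f1, f2 = f(x1), f(x2)
        if f1 < f2:
            hi, best = x2, x1              # minimum lies in [lo, x2]
        else:
            lo, best = x1, x2              # minimum lies in [x1, hi]
        if hi - lo <= tol:                 # width-based stop (assumption)
            break
    return best


# Hypothetical objective with its minimum at x = 2.
xmin = golden_section_min(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
print(xmin)
```

In the minimum distance setting the objective f would be the Cramer-von Mises or Anderson-Darling distance as a function of the trial location parameter, with the bracket running from 0 to the smallest observed time.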




Appendix I.  Matlab Code for Distribution Function Estimators and MISE

% MISEexp.m
% This program estimates the Mean Integrated Squared Error (MISE)
% for kernel density estimators, ML estimators, the KME, and the PEXE for
% randomly censored data
% Written by Dave Reineke, 1998

nrep=25000;
for n=20;
  % Numerical integration will be performed using Simpson's rule with
  % 101 function evaluations.  The weight vector is as follows:
  tic
  wt(1)=1;
  for i=2:2:100
    wt(i)=4;
  end;
  for i=3:2:99
    wt(i)=2;
  end;
  wt(101)=1;
  rand('seed',1);
  beta=1;
  eta=50;
  loc=0;
  q=.80;
  Theta=eta*(1-q)/q;
  m=0;
  while m < nrep
    clear FAIL RATE EXSUM PEXE STTT phi KM KME s H sf P J

    % Construct a randomly censored data set
    for i=1:n
      T(i)=eta*(-log(1-rand)).^(1/beta) + loc;
      C=exprnd(Theta) + loc;
      X(i,1)=min(T(i),C);
      if C<T(i)
        X(i,2)=0;
      else
        X(i,2)=1;
      end;




    end;
    T=sort(T);
    [data,I]=sort(X(:,1));
    for i=1:n
      data(i,2)=X(I(i),2);
    end;
    r=0;
    for i=1:n
      if data(i,2)==1
        r=r+1;
        FAIL(r)=data(i,1);
      end;
    end;

    if r>1
      m=m+1;
      qobs(m)=1-r/n;

      % Find the ML estimate of scale
      ETA=sum(data(:,1))/r;
      lower=0;
      upper=max(T);
      dx=(upper-lower)/100;
      x=lower:dx:upper;
      F=1-exp(-x./eta);
      Fnmle=(F - (1-exp(-x./ETA))).^2;
      Imle(m)=wt*Fnmle'*dx/3;

      % Construct the Kaplan-Meier PL estimator of the df
      for j=1:n
        KM(j) = ((n-j)/(n-j+1))^(data(j,2));
      end;
      KM=cumprod(KM);
      if data(n,2)==0
        KM(n)=0;
      end;
      for j=1:length(x)
        if x(j)<=data(1,1)




          KME(j)=1;
        end;
        for i=2:n
          if x(j)>data(i-1,1)
            if x(j)<data(i,1)
              KME(j)=KM(i-1);
            end;
          end;
        end;
        if x(j) >= data(n)
          KME(j)=0;
        end;
      end;
      FnKME=(F - (1-KME)).^2;
      IKME(m)=wt*FnKME'*dx/3;

      % Construct Sweeder's estimator using the KME
      k=4;
      mm=floor(n/k);
      R=n-k*mm;
      for l=1:R
        for j=1:mm+1
          YX(j,l)=data(l+k*(j-1),1);
          YI(j,l)=data(l+k*(j-1),2);
        end;
      end;
      for l=R+1:k
        for j=1:mm
          YX(j,l)=data(l+k*(j-1),1);
          YI(j,l)=data(l+k*(j-1),2);
        end;
      end;
      for l=1:R
        for j=1:mm+1
          kmsub(j,l) = ((mm+1-j)/(mm+1-j+1)).^YI(j,l);
        end;
      end;
      for l=R+1:k




        for j=1:mm
          kmsub(j,l) = ((mm-j)/(mm-j+1)).^YI(j,l);
        end;
      end;
      KMsub=1-cumprod(kmsub);

      for l=1:R
        extrap(1,l)=max((2*YX(1,l)-YX(2,l)),data(1,1));
        extrap(2,l)=min((2*YX(mm,l)-YX(mm-1,l)),data(n,1));
        for j=1:length(x)
          if x(j)< extrap(1,l)
            Fhatsub(j,l)=0;
          end;
          if x(j) >=extrap(1,l)
            if x(j) < YX(1,l)
              Fhatsub(j,l)=((KMsub(1,l))/2)*(1-cos(pi*(x(j)-extrap(1,l))/(YX(1,l)-extrap(1,l))));
            end;
          end;
          for i=1:mm
            if x(j)>=YX(i,l)
              if x(j)< YX(i+1,l)
                Fhatsub(j,l)=KMsub(i,l)+((KMsub(i+1,l)-KMsub(i,l))/2)*(1-cos(pi*(x(j)-YX(i,l))/(YX(i+1,l)-YX(
              end;
            end;
          end;
          if x(j) >= YX(mm+1,l)
            if x(j) < extrap(2,l)
              Fhatsub(j,l)=KMsub(mm+1,l)+((1-KMsub(mm+1,l))/2)*(1-cos(pi*(x(j)-YX(mm+1,l))/(extrap(2,l)-YX
            end;
          end;
          if x(j) >= extrap(2,l)
            Fhatsub(j,l)=1;
          end;
        end;
      end;

      for l=R+1:k
        extrap(1,l)=max((2*YX(1,l)-YX(2,l)),data(1,1));
        extrap(2,l)=min((2*YX(mm,l)-YX(mm-1,l)),data(n,1));




        for j=1:length(x)
          if x(j)< extrap(1,l)
            Fhatsub(j,l)=0;
          end;
          if x(j) >=extrap(1,l)
            if x(j) < YX(1,l)
              Fhatsub(j,l)=((KMsub(1,l))/2)*(1-cos(pi*(x(j)-extrap(1,l))/(YX(1,l)-extrap(1,l))));
            end;
          end;
          for i=1:mm-1
            if x(j)>=YX(i,l)
              if x(j)< YX(i+1,l)
                Fhatsub(j,l)=KMsub(i,l)+((KMsub(i+1,l)-KMsub(i,l))/2)*(1-cos(pi*(x(j)-YX(i,l))/(YX(i+1,l)-YX(
              end;
            end;
          end;
          if x(j) >= YX(mm,l)
            if x(j) < extrap(2,l)
              Fhatsub(j,l)=KMsub(mm,l)+((1-KMsub(mm,l))/2)*(1-cos(pi*(x(j)-YX(mm,l))/(extrap(2,l)-YX(mm,l)
            end;
          end;
          if x(j) >= extrap(2,l)
            Fhatsub(j,l)=1;
          end;
        end;
      end;
      Fhat=mean(Fhatsub');
      Fn=(F-Fhat).^2;
      ITSJKME(m)=wt*Fn'*dx/3;

      % Mean Order Number Estimator
      r=0;
      if data(1,2)==1
        r=1;
        P(1)=1;
        J(1)=1;
      end;
      for i=2:n




        if data(i,2)==1
          r=r+1;
          if r==1
            P(1)=(n+1)/(n-i+2);
            J(1)=P(1);
          elseif r > 1
            if data(i-1,2)==0
              J(r)=(n+1-P(r-1))/(n-i+2);
            else
              J(r)=J(r-1);
            end;
            P(r)=P(r-1)+J(r);
          end;
        end;
      end;
      sf=P/n;
      for j=1:length(x)
        if x(j)<=FAIL(1)
          MON(j)=0;
        end;
        for i=2:r
          if x(j)>FAIL(i-1)
            if x(j)<=FAIL(i)
              MON(j)=sf(i-1);
            end;
          end;
        end;
        if x(j) > FAIL(r)
          MON(j)=1;
        end;
      end;
      Fn=(F-MON).^2;
      IMON(m)=wt*Fn'*dx/3;

      % Construct the PEXE of the df
      k=0;  N=0;  R=0;
      if data(1,2)==1, k=1;
        STTT(1)=n*data(1,1);
      elseif data(1,2)==0, R=data(1,1);
        N=1;
      end;




      for i=2:n
        if data(i,2)==1
          if k==0
            k=k+1;
            STTT(k)=(n-i+1)*data(i,1) + R;
          else
            k=k+1;
            STTT(k)=(n-i+1)*data(i,1) + R - (N+n-i+1)*data(i-(N+1),1);
          end;
          R=0;  N=0;
        elseif data(i,2)==0
          R=R+data(i,1);
          N=N+1;
        end;
      end;
      for i=1:k
        RATE(i)=1/STTT(i);
      end;
      NEXT=0;
      EXSUM(1)=RATE(1)*FAIL(1);
      for i=2:k
        NEXT=RATE(i)*(FAIL(i) - FAIL(i-1));
        EXSUM(i)=EXSUM(i-1) + NEXT;
      end;
      PTTT(1)=-(1/RATE(1))*(exp(-EXSUM(1)) - 1);
      for i=2:k
        PIECE=-(1/RATE(i))*(exp(-EXSUM(i)) - exp(-EXSUM(i-1)));
        PTTT(i)=PTTT(i-1) + PIECE;
      end;
      for i=1:k
        PCDF(i)=1-exp(-EXSUM(i));
      end;
      Ehat=FAIL(r)/sqrt(-log(1-PCDF(r)));
      for j=1:length(x)
        if x(j)<=FAIL(1)
          PEXE(j)=1-exp(-RATE(1)*x(j));
        end;
        for i=2:r
          if x(j)>FAIL(i-1)
            if x(j)<=FAIL(i)
              PEXE(j)=1-exp(-EXSUM(i-1)-RATE(i)*(x(j)-FAIL(i-1)));
            end;




          end;
        end;
        if x(j) > FAIL(r)
          PEXE(j)=1-exp(-EXSUM(r)-RATE(r)*(x(j)-FAIL(r)));
        end;
      end;
      FnPEXE=(F - PEXE).^2;
      IPEXE(m)=wt*FnPEXE'*dx/3;

      % Construct the Foldes-Rejto-Winter kernel estimate of the df
      sd=std(FAIL);

      h=sd*n^(-1/5);

      s(1)=1-KM(2);
      for j=2:n-1
        s(j)=KM(j)-KM(j+1);
      end;
      s(n)=KM(n);
      DAT=data(:,1);
      keFRW=0;
      for j=1:n
        keFRW=keFRW + s(j)*normcdf((x'-DAT(j,:)*ones(size(x')))/h);
      end;
      FnKEFRW=(F - keFRW').^2;
      IKEFRW(m)=wt*FnKEFRW'*dx/3;

      % Construct the Blum-Susarla kernel estimate of the df
      for j=1:n
        H(j)=((n-j+1)/(n-j+2))^(1-data(j,2));
      end;
      H=cumprod(H);
      DAT=data(:,1);
      keBS=0;
      for j=1:n
        keBS=keBS + data(j,2)*normcdf((x'-DAT(j,:)*ones(size(x')))/h)/(H(j)*n);
      end;
      FnKEBS=(F - keBS').^2;
      IKEBS(m)=wt*FnKEBS'*dx/3;




      % Klein, Lee, & Moeschberger Survivor Function Estimator
      SORTDATA=data;
      mle
      for j=1:length(x)
        for i=1:n
          if x(j) < data(i,1)
            phi(i,j)=1;
          else
            if data(i,2)==1
              phi(i,j)=0;
            else
              phi(i,j)=exp(-(x(j)/ETAhat).^BETAhat + (data(i,1)/ETAhat).^BETAhat);
            end;
          end;
        end;
      end;
      KLM=sum(phi)/n;
      FnKLM=(F - (1-KLM)).^2;
      IKLM(m)=wt*FnKLM'*dx/3;

    end;
  end;

  qbar=mean(qobs);

  mImle=mean(Imle);
  sImle=std(Imle);

  mIKME=mean(IKME);
  sIKME=std(IKME);

  mITSJKME=mean(ITSJKME);
  sITSJKME=std(ITSJKME);

  mIMON=mean(IMON);
  sIMON=std(IMON);

  mIPEXE=mean(IPEXE);
  sIPEXE=std(IPEXE);

  mIKEFRW=mean(IKEFRW);
  sIKEFRW=std(IKEFRW);




  mIKEBS=mean(IKEBS);
  sIKEBS=std(IKEBS);

  mIKLM=nanmean(IKLM);
  sIKLM=nanstd(IKLM);

  fid=fopen('MISEexp8.txt','a');
  fprintf(fid,'Output for samples of size %2.0f \n',n);
  fprintf(fid,' (Monte Carlo size = %g) \n',nrep);
  fprintf(fid,'Expected proportion of censoring q = %1.2f\n',q);
  fprintf(fid,'Observed proportion of censoring q = %1.2f\n',qbar);
  fprintf(fid,' \n ');
  fprintf(fid,'Failure distn: Exp. scale=%g\n',eta);
  fprintf(fid,' \n');
  fprintf(fid,'Censoring distn: Exponential scale=%g\n',Theta);
  fprintf(fid,' \n');
  fprintf (fid, > )  ;
  fprintf(fid,' \n ');
  fprintf(fid,'MISE for the following df estimators\n');
  fprintf(fid,' \n ');
  fprintf(fid,' MLE KME TSJKME MONE PEXE FRWKE BSKE KLME\n');
  fprintf(fid,'mean:\n');
  fprintf(fid,'%1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f\n',mImle,mIKME,mITSJKME
  fprintf(fid,' std dev.:\n');
  fprintf(fid,'(%1.4f) & (%1.4f) & (%1.4f) & (%1.4f) & (%1.4f) & (%1.4f) & (%1.4f) & (%1.4f)\n',sIml
  fprintf(fid,' \n');
  fprintf(fid,' **********************************************************************\n');
  fclose(fid);

  sprintf('MISEexp with n=%g and q=%g',n,q)
end;
toc
quit
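
Two building blocks of MISEexp.m, the product-limit estimator and the Simpson's rule integration of a squared error, can be sketched compactly. The Python code below is an illustration under stated assumptions, not a translation of the listing: ties are broken arbitrarily, the final survival value is not forced to zero when the largest observation is censored (the listing does force it), and the names kaplan_meier, simpson and integrated_squared_error are hypothetical.

```python
import math


def kaplan_meier(times, failed):
    """Return sorted times and the product-limit survival value after each."""
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    surv, s = [], 1.0
    for j, i in enumerate(order, start=1):
        if failed[i] == 1:
            s *= (n - j) / (n - j + 1)     # same factor as KM(j) in MISEexp.m
        surv.append(s)
    return [times[i] for i in order], surv


def simpson(values, dx):
    """Composite Simpson's rule with weights 1, 4, 2, ..., 4, 1."""
    if len(values) % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of points")
    total = values[0] + values[-1]
    total += 4.0 * sum(values[1:-1:2]) + 2.0 * sum(values[2:-1:2])
    return total * dx / 3.0


def integrated_squared_error(times, failed, true_cdf, upper, m=100):
    """Integrate (F - F_KM)^2 over [0, upper] on m + 1 equally spaced points."""
    sorted_times, surv = kaplan_meier(times, failed)

    def km_cdf(x):
        s = 1.0
        for t, v in zip(sorted_times, surv):
            if t <= x:
                s = v
            else:
                break
        return 1.0 - s

    dx = upper / m
    values = [(true_cdf(i * dx) - km_cdf(i * dx)) ** 2 for i in range(m + 1)]
    return simpson(values, dx)


# Hypothetical data: four observations, the second smallest is a withdrawal.
t_sorted, s_hat = kaplan_meier([3, 1, 2, 4], [1, 1, 0, 1])
ise = integrated_squared_error([3, 1, 2, 4], [1, 1, 0, 1],
                               lambda x: 1.0 - math.exp(-x / 2.0), 4.0)
print(t_sorted, s_hat, ise)
```

Averaging integrated_squared_error over many simulated censored samples would give a Monte Carlo estimate of the MISE for the Kaplan-Meier estimator, which is the quantity the Matlab program stores in IKME.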




Appendix J. Percentage Point Generation for the Exponential with Exponential Censoring

% EECM3.m  Exponential, Exponential Censoring Model

% Monte Carlo study of CvM and AD GOF test statistics for randomly
% censored data using the KME in place of the EDF.

% Expected proportion censored: q

% Composite Hypothesis: Exponential with Exponential censoring

% Written by Dave Reineke, 1998

for n=20:20:100

rand('seed',1);

nrep=250000;
m=0;

q=.9;

beta=2;
eta=50;

Theta=eta*(1-q)/q;
% Theta=415;

tic

while m < 250000

clear FAIL KM KME km U t

% Construct a randomly censored data set

for i=1:n

t=eta*(-log(1-rand));

% t=eta*(-log(1-rand)).^(1/beta);

% t=5+(120-5)*rand;  % Alternative distribution

C=Theta*(-log(1-rand));

X(i,1)=min(t,C);
if C<t
X(i,2)=0;
else
X(i,2)=1;
end;

end;

[data,I]=sort(X(:,1));
for i=1:n
data(i,2)=X(I(i),2);
end;


for i=1:n
km(i)=((n-i)/(n-i+1)).^data(i,2);
end;

KM=cumprod(km);
KM(n)=0;

r=0;
for i=1:n
if data(i,2)==1
r=r+1;
FAIL(r)=data(i,1);
KME(r)=1-KM(i);
end;
end;

if r>=2
m=m+1;

qobs(m)=1-r/n;

KME(r)=1;

ETAhat=sum(data(:,1))/r;  % Scale MLE for Exponential distn only

U=1-exp(-FAIL/ETAhat);

% Construct the CvM statistic

Wsum=0;
for i=2:r
Wsum=Wsum+(KME(i-1)^2)*(U(i)-U(i-1))-KME(i-1)*(U(i)^2-U(i-1)^2)+(U(i).^3-U(i-1).^3)/3;
end;

W2(m)=n*(U(1).^3)/3 + n*Wsum;

% Construct the AD statistic

if U(1)==0

Asum=0;
for i=3:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2(m)=-n*(U(2)+log(1-U(2)))+n*Asum;

elseif U(1)>0

Asum=0;
for i=2:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2(m)=-n*(U(1)+log(1-U(1)))+n*Asum;

end;

end;

end;

qbar=mean(qobs);

mW2=mean(W2);
sW2=std(W2);
minW2=min(W2);
medW2=prctile(W2,50);
maxW2=max(W2);

ppW2(1)=prctile(W2,75);
ppW2(2)=prctile(W2,80);
ppW2(3)=prctile(W2,85);
ppW2(4)=prctile(W2,90);
ppW2(5)=prctile(W2,95);
ppW2(6)=prctile(W2,97.5);
ppW2(7)=prctile(W2,99);
ppW2(8)=prctile(W2,99.5);
ppW2(9)=prctile(W2,99.9);

mA2=mean(A2);
sA2=std(A2);
minA2=min(A2);
medA2=prctile(A2,50);
maxA2=max(A2);

ppA2(1)=prctile(A2,75);
ppA2(2)=prctile(A2,80);
ppA2(3)=prctile(A2,85);
ppA2(4)=prctile(A2,90);
ppA2(5)=prctile(A2,95);
ppA2(6)=prctile(A2,97.5);
ppA2(7)=prctile(A2,99);
ppA2(8)=prctile(A2,99.5);
ppA2(9)=prctile(A2,99.9);


fid=fopen('EECMm9.txt','a');

fprintf(fid,'Source: EECM3.m\n');
fprintf(fid,'Sample size: n = %g\n',n);
fprintf(fid,'Monte Carlo size: N = %g\n',m);
fprintf(fid,'Expected prop. censored: q = %1.2f\n',q);
fprintf(fid,'Observed prop. censored: qbar = %1.2f\n',qbar);
fprintf(fid,' \n');

fprintf(fid,'Failure dist: EXP scale = %g\n',eta);
fprintf(fid,'Censoring distn: EXP scale = %g\n',Theta);
fprintf(fid,' \n');

fprintf(fid,'******************************************************************\n');
fprintf(fid,' \n');

fprintf(fid,'Percentage Points of the CvM statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.25 & 0.20 & 0.15 & 0.10 & 0.05 & 0.025 & 0.01 & 0.005 & 0.001\n');
fprintf(fid,' \n');

fprintf(fid,'%1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f \n',ppW2);
fprintf(fid,' \n');

fprintf(fid,'Minimum min = %1.6f\n',minW2);
fprintf(fid,'Median med = %1.6f\n',medW2);
fprintf(fid,'Maximum max = %1.6f\n',maxW2);
fprintf(fid,'Mean E[W2] = %1.6f\n',mW2);
fprintf(fid,'Standard Deviation sd[W2] = %1.6f\n',sW2);
fprintf(fid,' \n');
fprintf(fid,' \n');

fprintf(fid,'Percentage Points of the AD statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.25 & 0.20 & 0.15 & 0.10 & 0.05 & 0.025 & 0.01 & 0.005 & 0.001\n');
fprintf(fid,' \n');

fprintf(fid,'%1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f\n',ppA2);
fprintf(fid,' \n');

fprintf(fid,'Minimum min = %1.6f\n',minA2);
fprintf(fid,'Median med = %1.6f\n',medA2);
fprintf(fid,'Maximum max = %1.6f\n',maxA2);
fprintf(fid,'Mean E[A2] = %1.6f\n',mA2);
fprintf(fid,'Standard Deviation sd[A2] = %1.6f\n',sA2);

fprintf  (fid,  ; 

fprintf(fid,' \n');
fprintf(fid,' \n');

fclose(fid);

sprintf('Output: EECMm9.txt with n=%g and q=%g',n,q)

toc

end;

quit
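
The product-limit step in the listing above can be isolated as a short script. What follows is a minimal sketch rather than the author's code: the six observations are hypothetical, and vectorized operations stand in for the listing's loops. It keeps the listing's two conventions, forcing the survival estimate to zero at the largest observation and setting the KME to one at the last failure time.

```matlab
% Sketch: Kaplan-Meier estimate of the distribution function at the failure times.
% Hypothetical data: column 1 is min(T,C), column 2 is 1 for a failure, 0 if censored.
X = [5 1; 8 0; 12 1; 3 1; 20 0; 15 1];
n = size(X,1);

[~, I] = sort(X(:,1));              % order by observed time
data = X(I,:);                      % indicators travel with their times

i  = (1:n)';
km = ((n-i)./(n-i+1)).^data(:,2);   % factor equals 1 at censored observations
S  = cumprod(km);                   % survival estimate after each observation
S(n) = 0;                           % listing's convention at the largest observation

fail = data(:,2)==1;
FAIL = data(fail,1)';               % ordered failure times
KME  = 1 - S(fail)';                % distribution-function estimate at those times
r    = numel(FAIL);
KME(r) = 1;                         % listing's convention at the last failure
qobs = 1 - r/n;                     % observed proportion censored
```

With these data the failure times are 3, 5, 12 and 15, and the estimate steps through 1/6, 1/3 and 5/9 before being set to one.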




Appendix K. Percentage Point Generation for the Weibull with Exponential Censoring

% W2ECM1.m

% Monte Carlo study of the CvM and AD GOF test statistics for randomly
% censored data using the KME in place of the EDF.

% Expected proportion censored: q

% Composite Hypothesis: Weibull with Exponential censoring
% Written by Dave Reineke, 1998
for n=20:20:100

rand('seed',1);
nrep=250000;
m=0;

kappa=2;
eta=50;
loc=20;

Theta=80.5;
q=.40;

tic

while m < nrep

clear FAIL KM KME KMt km U t

% Construct a randomly censored data set

for i=1:n

T=eta*(-log(1-rand)).^(1/kappa) + loc;
% T=5+(120-5)*rand;  % Alternative distribution

C=Theta*(-log(1-rand)) + loc;

X(i,1)=min(T,C);
if C<T
X(i,2)=0;
else
X(i,2)=1;
end;

end;

[data,I]=sort(X(:,1));
for i=1:n
data(i,2)=X(I(i),2);
end;

for i=1:n
km(i)=((n-i)/(n-i+1)).^data(i,2);
end;

KM=cumprod(km);
KM(n)=0;

r=0;
for i=1:n
if data(i,2)==1
r=r+1;
FAIL(r)=data(i,1);
KME(r)=1-KM(i);
end;
end;

if r>=2
m=m+1;

qobs(m)=1-r/n;

KME(r)=1;

SORTDATA=data;

% **********Find MLEs for location & scale parameters*****************
% shape is known*****************************

% mdloc.m  Finds the minimum distance estimate (MDE) of the location
% parameter of a 3 parameter Weibull distribution assuming
% known shape.

% Written by Dave Reineke

% SORTDATA is a n by 2 matrix of failure and withdrawal times.
% FAIL is the set of failure times only.

%  KME  is  the  Kaplan-Meier  estimator  of  the  distribution  function. 
GAMhat1=.999*data(1,1);

ETAhat1=(sum((data(:,1)-GAMhat1).^kappa)/r).^(1/kappa);
t=1-exp(-((FAIL-GAMhat1)./ETAhat1).^kappa);

% MDE using the Anderson-Darling statistic (KME vs MLE)
diff=1;

% Golden Search algorithm
alf=2/(1+sqrt(5));

lt=0;
rt=min(FAIL);

c2=0;

while diff > .00000001
c2=c2+1;
if c2 == 1000
c2
GAMhatA=.999*data(1,1);
break;
end;

x1=lt + (1-alf)*(rt-lt);
x2=lt + alf*(rt-lt);

t=1-exp(-((FAIL-x1)./ETAhat1).^kappa);
if t(1)==0
if r>2
A1md=-(KME(2)^2)*log(t(2))-(1-(KME(2)-1)^2)*log(1-t(2));
for i=3:r
A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md1=-n+n*A1md;
elseif r==2
A2md1=-n+n*(-log(t(2)) - log(1-t(2)));
end;

elseif t(1)>0
A1md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
for i=2:r
A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md1=-n+n*A1md;
end;

t=1-exp(-((FAIL-x2)./ETAhat1).^kappa);
if t(1)==0
if r>2
A2md=-(KME(2)^2)*log(t(2))-(1-(KME(2)-1)^2)*log(1-t(2));
for i=3:r
A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md2=-n+n*A2md;
elseif r==2
A2md2=-n+n*(-log(t(2)) - log(1-t(2)));
end;

elseif t(1)>0
A2md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
for i=2:r
A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md2=-n+n*A2md;
end;

if A2md1 < A2md2
rt=x2;
GAMhatA=x1;
else
lt=x1;
GAMhatA=x2;
end;

diff=abs(A2md1-A2md2);
end;

ETAhatA=(sum((data(:,1)-GAMhatA).^kappa)/r).^(1/kappa);

% ******************************************************************

% Analysis for AD-based MDE
U=1-exp(-((FAIL-GAMhatA)/ETAhatA).^kappa);

% Construct the CvM statistic

Wsum=0;
for i=2:r
Wsum=Wsum+(KME(i-1)^2)*(U(i)-U(i-1))-KME(i-1)*(U(i)^2-U(i-1)^2)+(U(i).^3-U(i-1).^3)/3;
end;

W2(m)=n*(U(1).^3)/3 + n*Wsum;

% Construct the AD statistic

if U(1)==0

Asum=0;
for i=3:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2(m)=-n*(U(2)+log(1-U(2)))+n*Asum;

elseif U(1)>0

Asum=0;
for i=2:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2(m)=-n*(U(1)+log(1-U(1)))+n*Asum;

end;

end;

end;

qbar=mean(qobs);


mW2=mean(W2);
sW2=std(W2);
minW2=min(W2);
medW2=prctile(W2,50);
maxW2=max(W2);

ppW2(1)=prctile(W2,75);
ppW2(2)=prctile(W2,80);
ppW2(3)=prctile(W2,85);
ppW2(4)=prctile(W2,90);
ppW2(5)=prctile(W2,95);
ppW2(6)=prctile(W2,97.5);
ppW2(7)=prctile(W2,99);
ppW2(8)=prctile(W2,99.5);
ppW2(9)=prctile(W2,99.9);

mA2=mean(A2);
sA2=std(A2);
minA2=min(A2);
medA2=prctile(A2,50);
maxA2=max(A2);

ppA2(1)=prctile(A2,75);
ppA2(2)=prctile(A2,80);
ppA2(3)=prctile(A2,85);
ppA2(4)=prctile(A2,90);
ppA2(5)=prctile(A2,95);
ppA2(6)=prctile(A2,97.5);
ppA2(7)=prctile(A2,99);
ppA2(8)=prctile(A2,99.5);
ppA2(9)=prctile(A2,99.9);

fid=fopen('W2ECM4.txt','a');

fprintf(fid,'Source: W2ECM1.m\n');
fprintf(fid,'Sample size: n = %g\n',n);
fprintf(fid,'Monte Carlo size: N = %g\n',m);
fprintf(fid,'Expected prop. censored: q = %1.2f\n',q);
fprintf(fid,'Observed prop. censored: qbar = %1.2f\n',qbar);
fprintf(fid,' \n');

fprintf(fid,'Failure dist: WEIBULL loc= %g, scale= %g, shape= %g\n',loc,eta,kappa);
fprintf(fid,'Censoring distn: EXP scale = %g\n',Theta);
fprintf(fid,' \n');

fprintf(fid,'******************************************************************\n');
fprintf(fid,' \n');

fprintf(fid,'Percentage Points of the CvM statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.25 & 0.20 & 0.15 & 0.10 & 0.05 & 0.025 & 0.01 & 0.005 & 0.001\n');
fprintf(fid,' \n');

fprintf(fid,'%1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f \n',ppW2);
fprintf(fid,' \n');

fprintf(fid,'Minimum min = %1.6f \n',minW2);
fprintf(fid,'Median med = %1.6f\n',medW2);
fprintf(fid,'Maximum max = %1.6f\n',maxW2);
fprintf(fid,'Mean E[W2] = %1.6f\n',mW2);
fprintf(fid,'Standard Deviation sd[W2] = %1.6f\n',sW2);
fprintf(fid,' \n');
fprintf(fid,' \n');

fprintf(fid,'Percentage Points of the AD statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.25 & 0.20 & 0.15 & 0.10 & 0.05 & 0.025 & 0.01 & 0.005 & 0.001\n');
fprintf(fid,' \n');

fprintf(fid,'%1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f & %1.4f\n',ppA2);
fprintf(fid,' \n');

fprintf(fid,'Minimum min = %1.6f \n',minA2);
fprintf(fid,'Median med = %1.6f\n',medA2);
fprintf(fid,'Maximum max = %1.6f\n',maxA2);
fprintf(fid,'Mean E[A2] = %1.6f\n',mA2);
fprintf(fid,'Standard Deviation sd[A2] = %1.6f\n',sA2);

fprintf(fid,'******************************************************************\n');
fprintf(fid,' \n');
fprintf(fid,' \n');
fclose(fid);

sprintf('W2ECM1.m with n=%g, kappa=%g, and q=%g',n,kappa,q)

toc

end;

quit
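
The location search in the listing above is a golden-section search over the interval from zero to the smallest failure time. The following sketch shows the same bracketing rule on a hypothetical one-dimensional objective; the objective `f`, the interval endpoints, and the tolerance are illustrative assumptions. Unlike the listing, which stops when the two objective values differ by less than 1e-8, this sketch stops on the width of the bracket, and like the listing it re-evaluates both interior points on every pass rather than reusing one of them.

```matlab
% Sketch: golden-section search for the minimizer of a unimodal function.
f   = @(x) (x-2).^2;        % hypothetical objective with its minimum at x = 2
alf = 2/(1+sqrt(5));        % inverse golden ratio, as in the listing
lt  = 0;                    % left end of the bracket
rt  = 5;                    % right end of the bracket (assumed)
tol = 1e-8;                 % stop when the bracket is this narrow (assumed)
iter = 0;

while (rt-lt) > tol && iter < 1000
    iter = iter + 1;
    x1 = lt + (1-alf)*(rt-lt);   % left interior point
    x2 = lt + alf*(rt-lt);       % right interior point
    if f(x1) < f(x2)
        rt = x2;                 % minimizer lies in [lt, x2]
    else
        lt = x1;                 % minimizer lies in [x1, rt]
    end
end

xmin = (lt+rt)/2;                % midpoint of the final bracket
```

Each pass shrinks the bracket by the factor `alf`, so the iteration cap of 1000 carried over from the listing is reached only if the tolerance is set unrealistically small.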




Appendix  L.  Power  Study  for  the  Exponential  with  Exponential  censoring 

% POWexp1.m  Exponential, Exponential Censoring Model

% Monte Carlo POWER study of CvM and AD GOF test statistics for randomly
% censored data using the KME in place of the EDF.

% Expected proportion censored: q

% Composite Hypothesis: Exponential with Exponential censoring

% Written by Dave Reineke, 1999

h=0;

rejW=zeros(10,4);
rejA=zeros(10,4);

% The following matrices represent percentage points for the CvM and AD
% modified test stats: rows correspond to sample sizes 20:20:200 and
% columns correspond to alpha levels .10, .05, .025, & .01.

PPql

rand('seed',2);

nrep=1000;

beta=2;
eta=2;

q=.1;

Theta=67.34;

for n=20:20:200

h=h+1;

m=0;

tic

while m < 1000

clear FAIL KM KME km U t

% Construct a randomly censored data set

for i=1:n

% t=eta*(-log(1-rand));  % Lifetime distribution (Exp)

% t=eta*(-log(1-rand)).^(1/beta);  % Weibull Alternative

% t=exp(normrnd(0,1));  % Lognormal Alternative
% t=gamrnd(beta,eta);  % Gamma(chi-sq.) Alt.
t=100*rand;  % Uniform(0,100) Alt.

C=Theta*(-log(1-rand));  % Censoring Distribution (Exp)

X(i,1)=min(t,C);
if C<t
X(i,2)=0;
else
X(i,2)=1;
end;

end;

[data,I]=sort(X(:,1));
for i=1:n
data(i,2)=X(I(i),2);
end;

for i=1:n
km(i)=((n-i)/(n-i+1)).^data(i,2);
end;

KM=cumprod(km);
KM(n)=0;

r=0;
for i=1:n
if data(i,2)==1
r=r+1;
FAIL(r)=data(i,1);
KME(r)=1-KM(i);
end;
end;

if r>=2
m=m+1;

qobs(m)=1-r/n;

KME(r)=1;

ETAhat=(sum(data(:,1))/r);  % Scale MLE for Exponential distn only

U=1-exp(-FAIL/ETAhat);

% Construct the CvM statistic

Wsum=0;
for i=2:r
Wsum=Wsum+(KME(i-1)^2)*(U(i)-U(i-1))-KME(i-1)*(U(i)^2-U(i-1)^2) + (U(i).^3-U(i-1).^3)/3;
end;

W2=n*(U(1).^3)/3 + n*Wsum;

% Construct the AD statistic

if U(1)==0

Asum=0;
for i=3:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2=-n*(U(2)+log(1-U(2)))+n*Asum;

elseif U(1)>0

Asum=0;
for i=2:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2=-n*(U(1)+log(1-U(1)))+n*Asum;

end;

% Tally the number of test statistics over given percentage points

for i=1:4

if W2 > ppW(h,i)
rejW(h,i)=rejW(h,i)+1;
end;

if A2 > ppA(h,i)
rejA(h,i)=rejA(h,i)+1;
end;

end;

end;

end;

qbar=mean(qobs)  ; 

powW=rejW/m; 

powA=rejA/m; 

fid=fopen('POWexp1.txt','a');

fprintf(fid,'Source: POWexp1.m\n');
fprintf(fid,'Sample size: n = %g\n',n);
fprintf(fid,'Monte Carlo size: N = %g\n',m);
fprintf(fid,'Expected prop. censored: q= %1.2f \n',q);
fprintf(fid,'Observed prop. censored: qbar = %1.2f\n',qbar);
fprintf(fid,' \n');

%fprintf(fid,'Failure distn (Alt): WEIBULL shape = %g, scale =%g\n',beta,eta);
fprintf(fid,'Alternative distn: Uniform (0,100)\n');
fprintf(fid,'Censoring distn: EXP scale = %g\n',Theta);
fprintf(fid,' \n');

fprintf(fid,'Hypothesized Distribution: Exponential\n');

fprintf(fid,'******************************************************************\n');
fprintf(fid,' \n');

fprintf(fid,'Estimated power of the CvM statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.10 & 0.05 & 0.025 & 0.01\n');
fprintf(fid,' \n');

fprintf(fid,'%1.4f & %1.4f & %1.4f & %1.4f \n',powW(h,:)');
fprintf(fid,' \n');
fprintf(fid,' \n');

fprintf(fid,'Estimated power of the AD statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.10 & 0.05 & 0.025 & 0.01\n');
fprintf(fid,' \n');

fprintf(fid,'%1.4f & %1.4f & %1.4f & %1.4f \n',powA(h,:)');
fprintf(fid,' \n');

fprintf(fid,'******************************************************************\n');
fprintf(fid,' \n');
fclose(fid);

sprintf('POWexp1.m with n=%g and q=%g',n,q)

toc

end;

quit
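
The CvM construction in the listing above is a piecewise integral of the squared difference between the step-function KME and the fitted distribution function. On an interval where the KME is constant at k, the integral of (k-u)^2 is k^2*du - k*d(u^2) + d(u^3)/3, which is the summand in the listing. The sketch below evaluates that sum for a hypothetical two-failure sample; the values of `n`, `KME`, and `U` are illustrative assumptions, and, as in the listing, the range above the last failure time contributes nothing.

```matlab
% Sketch: KME-based Cramer-von Mises statistic, following the listing's summation.
n   = 2;                 % hypothetical sample size
KME = [0.5 1];           % hypothetical KME values at the failure times
U   = [0.25 0.75];       % hypothetical fitted distribution-function values
r   = numel(U);

W2 = n*U(1)^3/3;         % piece on [0, U(1)), where the KME is zero
for i = 2:r
    k  = KME(i-1);       % KME is constant at k on [U(i-1), U(i))
    W2 = W2 + n*( k^2*(U(i)-U(i-1)) ...
                - k*(U(i)^2-U(i-1)^2) ...
                + (U(i)^3-U(i-1)^3)/3 );
end
```

For these inputs the two pieces integrate to 1/192 and 1/96, so the statistic is 2*(1/192 + 1/96) = 0.03125, which can be confirmed by integrating u^2 over [0, 0.25] and (0.5-u)^2 over [0.25, 0.75] by hand.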

% POWcrd1.m  Power study for the simultaneous tests of crude lifetimes and
% semi-parametric tests of fit with crude lifetimes for the
% exponential distribution with exponential censoring.

% Written by Dave Reineke, May 1999

h=0;

rejW1=zeros(5,3);
rejA1=zeros(5,3);
rejW2=zeros(5,3);
rejA2=zeros(5,3);

% The following matrices represent percentage points for the CvM and AD
% modified test stats: rows correspond to sample sizes 20:20:100 and
% columns correspond to alpha levels .10, .05, .025, & .01

ppW=[.175 .222 .271 .338];
ppA=[1.062 1.321 1.591 1.959];

rand('seed',2);

nrep=1000;

beta=2;

eta=2;

q=.2;

Theta=17;

for n=20:20:100

h=h+1;

m=0;

tic


while m < nrep

clear FAIL z CENS

% Construct a randomly censored data set
for i=1:n

% t=eta*(-log(1-rand));  % Lifetime distribution (Exp)

% t=eta*(-log(1-rand)).^(1/beta);  % Weibull Alternative

% t=exp(normrnd(0,1));  % Lognormal Alternative
t=gamrnd(beta,eta);  % Gamma(chi-sq.) Alt.

C=Theta*(-log(1-rand));  % Censoring Distribution (Exp)

X(i,1)=min(t,C);
if C<t
X(i,2)=0;
else
X(i,2)=1;
end;

end;

[data,I]=sort(X(:,1));
for i=1:n
data(i,2)=X(I(i),2);
end;


% Separate Crude Failure and Censoring Times

r=0;
K=0;

for i=1:n
if data(i,2)==1
r=r+1;
FAIL(r)=data(i,1);
elseif data(i,2)==0
K=K+1;
CENS(K)=data(i,1);
end;
end;

if r>=2
if K>=2
m=m+1;

qobs(m)=1-r/n;

% Find Scale Parameter Estimates for Each "Exponential" crude lifetime distribution

L1=mean(FAIL);
L2=mean(CENS);

% Conduct CvM Tests for the Exponential for the FAIL set and CENS set
z1=1-exp(-FAIL/L1);

W=0;
for i=1:r
W=W + (z1(i) - (i-.5)/r)^2;
end;

W=W + 1/(12*r);

W1=W*(1 + 0.16/r);  % CvM stat for FAIL set
z2=1-exp(-CENS/L2);

W=0;
for i=1:K
W=W + (z2(i) - (i-.5)/K)^2;
end;

W=W + 1/(12*K);

W2=W*(1 + 0.16/K);  % CvM stat for CENS set

% Conduct AD Tests for the Exponential for the FAIL set and CENS set
A=0;
for i=1:r
A=A + (2*i-1)*(log(z1(i)) + log(1 - z1(r+1-i)));
end;

A= -r - (1/r)*A;

A1= A*(1 + 0.6/r);  % AD stat for FAIL set
A=0;
for i=1:K
A=A + (2*i-1)*(log(z2(i)) + log(1 - z2(K+1-i)));
end;

A= -K - (1/K)*A;

A2= A*(1 + 0.6/K);  % AD stat for CENS set

% Tally the number of test statistics over given percentage points
% For the Simultaneous Crude Life Tests (Reject if EITHER test rejects)
for i=1:3

rejW1(h,i)=rejW1(h,i)+1;
if W1 < ppW(i+1)
if W2 < ppW(i+1)
rejW1(h,i)=rejW1(h,i)-1;
end;
end;

rejA1(h,i)=rejA1(h,i)+1;
if A1 < ppA(i+1)
if A2 < ppA(i+1)
rejA1(h,i)=rejA1(h,i)-1;
end;
end;

end;

% For the Semi-Parametric Crude Life Tests

for i=1:3
if W1 > ppW(i)
rejW2(h,i)=rejW2(h,i)+1;
end;

if A1 > ppA(i)
rejA2(h,i)=rejA2(h,i)+1;
end;
end;

end;

end;

end;

qbar=mean(qobs) ; 

powW1=rejW1/m;
powA1=rejA1/m;
powW2=rejW2/m;
powA2=rejA2/m;

fid=fopen('POWcrd1.txt','a');

fprintf(fid,'Source: POWcrd1.m\n');
fprintf(fid,'Sample size: n = %g\n',n);
fprintf(fid,'Monte Carlo size: N = %g\n',m);
fprintf(fid,'Expected prop. censored: q = %1.2f\n',q);
fprintf(fid,'Observed prop. censored: qbar = %1.2f \n',qbar);
fprintf(fid,' \n');

%fprintf(fid,'Failure distn: EXP scale = %g\n',eta);
%fprintf(fid,'Failure distn (Alt): WEIBULL shape = %g, scale =%g\n',beta,eta);
fprintf(fid,'Alternative distn: Gamma shape=%g, scale=%g\n',beta,eta);
fprintf(fid,'Censoring distn: EXP scale = %g\n',Theta);
fprintf(fid,' \n');

fprintf(fid,'Hypothesized Distribution: Exponential\n');

fprintf(fid,'******************************************************************\n');
fprintf(fid,'SIMULTANEOUS CRUDE LIFE TEST \n');
fprintf(fid,'Estimated power of the CvM and AD statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.10 & 0.05 & 0.025 \n');
fprintf(fid,' \n');

fprintf(fid,'CvM %1.4f & %1.4f & %1.4f \n',powW1(h,:)');
fprintf(fid,'AD %1.4f & %1.4f & %1.4f \n',powA1(h,:)');
fprintf(fid,' \n');
fprintf(fid,' \n');

fprintf(fid,'SEMI-PARAMETRIC CRUDE LIFE TEST \n');
fprintf(fid,'Estimated power of the CvM statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.10 & 0.05 & 0.025 \n');
fprintf(fid,' \n');

fprintf(fid,'CvM %1.4f & %1.4f & %1.4f \n',powW2(h,:)');
fprintf(fid,'AD %1.4f & %1.4f & %1.4f \n',powA2(h,:)');

fprintf(fid,'******************************************************************\n');
fprintf(fid,' \n');
fclose(fid);

sprintf('POWcrd1.m with n=%g and q=%g',n,q)

toc

end;
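
The crude-lifetime tests in the listing above apply the complete-sample CvM and AD statistics to the failure set and the censored set separately, each with its own estimated exponential scale. The following sketch computes both statistics for one hypothetical complete sample using vectorized sums in place of the listing's loops. The data values are illustrative assumptions; the small-sample factors (1 + 0.16/r) and (1 + 0.6/r) are copied from the listing.

```matlab
% Sketch: modified CvM and AD statistics for exponentiality, scale estimated by the mean.
x = sort([0.3 1.1 0.7 2.5 0.2 1.8]);   % hypothetical complete sample, sorted ascending
r = numel(x);
z = 1 - exp(-x/mean(x));               % fitted exponential distribution function
i = 1:r;

W    = sum((z - (i-0.5)/r).^2) + 1/(12*r);                  % Cramer-von Mises statistic
Wmod = W*(1 + 0.16/r);                                      % listing's CvM modification

A    = -r - sum((2*i-1).*(log(z) + log(1 - z(r+1-i))))/r;   % Anderson-Darling statistic
Amod = A*(1 + 0.6/r);                                       % listing's AD modification
```

The index `r+1-i` pairs each ordered value with its mirror image from the other end of the sample, which is why the sample must be sorted before `z` is formed.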




Appendix M. Power Study for the Weibull with Exponential Censoring

% POWw2exp1.m  Exponential, Exponential Censoring Model

% Monte Carlo POWER study of CvM and AD GOF test statistics for randomly
% censored data using the KME in place of the EDF.

%  Expected  proportion  censored:  q 

%  Composite  Hypothesis:  Exponential  with  Exponential  censoring 

%  Written  by  Dave  Reineke,  1999 

h=0; 

rejW=zeros(5,4) ; 
rejA=zeros(5,4) ; 

tic 

%  The  following  matrices  represent  percentage  points  for  the  CvM  and  AD 
%  modified  test  stats:  rows  correspond  to  samples  sizes  20:20:100  and 
%  columns  correspond  to  alpha  levels  .10,  .05,  .025,  &  .01. 

ppW= [0.171  0.208  0.245  0.294 
0.168  0.206  0.243  0.293 
0.167  0.204  0.241  0.293 
0.165  0.202  0.239  0.288 
0.164  0.201  0.238  0.288] ; 

ppA= [1.091  1.301  1.504  1.792 
1.076  1.286  1.496  1.768 
1.064  1.270  1.476  1.758 
1.048  1.253  1.457  1.726 
1.038  1.243  1.443  1.715]; 

rand('seed',2);

nrep=1000; 

kappa=2 ; 

beta=2; 

eta=50; 

loc=20; 

q=.2; 

Theta=193; 

for  n=20:20:100 

h=h+l ; 

m=0; 

while  m  <  nrep 

clear  FAIL  KM  KME  km  U  t 


% Construct a randomly censored data set

for i=1:n

% t=eta*(-log(1-rand))+loc;  % Lifetime distribution (Exp)
t=eta*(-log(1-rand)).^(1/beta)+loc;  % Weibull Alternative

% t=exp(normrnd(0.4,0.67))+loc;  % Lognormal Alternative
% t=gamrnd(beta,eta)+loc;  % Gamma(chi-sq.) Alt.

C=Theta*(-log(1-rand))+loc;  % Censoring Distribution (Exp)

X(i,1)=min(t,C);
if C<t
X(i,2)=0;
else
X(i,2)=1;
end;

end;

[data,I]=sort(X(:,1));
for i=1:n
data(i,2)=X(I(i),2);
end;

for i=1:n
km(i)=((n-i)/(n-i+1)).^data(i,2);
end;

KM=cumprod(km);
KM(n)=0;

r=0;

for i=1:n
if data(i,2)==1
r=r+1;
FAIL(r)=data(i,1);
KME(r)=1-KM(i);
end;
end;

if r>=2
m=m+1;

qobs(m)=1-r/n;

KME(r)=1;

% MD/ML estimation of location and scale parameters
SORTDATA=data;

% **********Find MLEs for location & scale parameters*****************
% **********Assume shape is known*****************************

% mdloc.m  Finds the minimum distance estimate (MDE) of the location
% parameter of a 3 parameter Weibull distribution assuming
% known shape.

% Written by Dave Reineke

% SORTDATA is a n by 2 matrix of failure and withdrawal times.
% FAIL is the set of failure times only.
% KME is the Kaplan-Meier estimator of the distribution function.

GAMhat1=.999*data(1,1);

ETAhat1=(sum((data(:,1)-GAMhat1).^kappa)/r).^(1/kappa);
t=1-exp(-((FAIL-GAMhat1)./ETAhat1).^kappa);

% MDE using the Anderson-Darling statistic (KME vs MLE)
diff=1;

% Golden Search algorithm
alf=2/(1+sqrt(5));

lt=0;
rt=min(FAIL);

c2=0;

while diff > .00000001
c2=c2+1;
if c2 == 1000
c2
GAMhatA=.999*data(1,1);
break;
end;

x1=lt + (1-alf)*(rt-lt);
x2=lt + alf*(rt-lt);

t=1-exp(-((FAIL-x1)./ETAhat1).^kappa);
if t(1)==0
if r>2
A1md=-(KME(2)^2)*log(t(2))-(1-(KME(2)-1)^2)*log(1-t(2));
for i=3:r
A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md1=-n+n*A1md;
elseif r==2
A2md1=-n+n*(-log(t(2)) - log(1-t(2)));
end;

elseif t(1)>0
A1md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
for i=2:r
A1md=A1md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md1=-n+n*A1md;
end;

t=1-exp(-((FAIL-x2)./ETAhat1).^kappa);
if t(1)==0
if r>2
A2md=-(KME(2)^2)*log(t(2))-(1-(KME(2)-1)^2)*log(1-t(2));
for i=3:r
A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md2=-n+n*A2md;
elseif r==2
A2md2=-n+n*(-log(t(2)) - log(1-t(2)));
end;

elseif t(1)>0
A2md=-(KME(1)^2)*log(t(1))-(1-(KME(1)-1)^2)*log(1-t(1));
for i=2:r
A2md=A2md + (KME(i-1)^2-KME(i)^2)*log(t(i))-((KME(i-1)-1)^2-(KME(i)-1)^2)*log(1-t(i));
end;
A2md2=-n+n*A2md;
end;

if A2md1 < A2md2
rt=x2;
GAMhat=x1;
else
lt=x1;
GAMhat=x2;
end;

diff=abs(A2md1-A2md2);
end;

ETAhat=(sum((data(:,1)-GAMhat).^kappa)/r).^(1/kappa);

U=1-exp(-((FAIL-GAMhat)/ETAhat).^kappa);

% Construct the CvM statistic

Wsum=0;
for i=2:r
Wsum=Wsum+(KME(i-1)^2)*(U(i)-U(i-1))-KME(i-1)*(U(i)^2-U(i-1)^2) + (U(i).^3-U(i-1).^3)/3;
end;

W2=n*(U(1).^3)/3 + n*Wsum;

% Construct the AD statistic

if U(1)==0

Asum=0;
for i=3:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2=-n*(U(2)+log(1-U(2)))+n*Asum;

elseif U(1)>0

Asum=0;
for i=2:r
Asum=Asum+(KME(i-1).^2)*(log(U(i))-log(U(i-1)))-((KME(i-1)-1).^2)*(log(1-U(i))-log(1-U(i-1))
end;

A2=-n*(U(1)+log(1-U(1)))+n*Asum;

end;

% Tally the number of test statistics over given percentage points

for i=1:4

if W2 > ppW(h,i)
rejW(h,i)=rejW(h,i)+1;
end;

if A2 > ppA(h,i)
rejA(h,i)=rejA(h,i)+1;
end;

end;

end;

end;

qbar=mean(qobs);

powW=rejW/m;

powA=rejA/m;

fid=fopen('POWw2exp2.txt','a');

fprintf(fid,'Source: POWw2exp1.m\n');
fprintf(fid,'Sample size: n = %g\n',n);
fprintf(fid,'Monte Carlo size: N = %g\n',m);
fprintf(fid,'Expected prop. censored: q = %1.2f\n',q);
fprintf(fid,'Observed prop. censored: qbar = %1.2f\n',qbar);
fprintf(fid,' \n');

fprintf(fid,'Failure distn (Alt): WEIBULL shape = %g, scale =%g\n',beta,eta);
%fprintf(fid,'Alternative distn: Lognormal from N(.4,.67)\n');
fprintf(fid,'Censoring distn: EXP scale = %g\n',Theta);
fprintf(fid,' \n');

fprintf(fid,'Hypothesized Distribution: Exponential\n');

fprintf(fid,'******************************************************************\n');
fprintf(fid,' \n');

fprintf(fid,'Estimated power of the CvM statistic: Composite Ho\n');
fprintf(fid,' \n');

fprintf(fid,' 0.10 & 0.05 & 0.025 & 0.01\n');
fprintf(fid,' \n');

fprintf(fid,'CvM Stat: %1.3f & %1.3f & %1.3f & %1.3f \n',powW(h,:)');
fprintf(fid,'AD Stat: %1.3f & %1.3f & %1.3f & %1.3f \n',powA(h,:)');
fprintf(fid,' \n');

fprintf (fid, ’ ) ; 
fprintf (fid, ’  \n’); 

fclose(fid);

sprintf('POWw2exp1.m with n=%g and q=%g',n,q)

end;

toc

quit
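
The lifetime draws in the listing above come from inverting the three-parameter Weibull distribution function F(t) = 1 - exp(-((t-loc)/eta)^kappa) at a uniform variate. The sketch below checks that inversion on a fixed grid of probabilities instead of random draws, so the result is reproducible; the grid `u` is an illustrative assumption and the parameter values are the ones set in the listing.

```matlab
% Sketch: inverse-transform sampling for a three-parameter Weibull distribution.
kappa = 2;                               % shape, as in the listing
eta   = 50;                              % scale, as in the listing
loc   = 20;                              % location, as in the listing
u = [0.05 0.25 0.5 0.75 0.95];           % fixed stand-ins for uniform(0,1) draws

t = eta*(-log(1-u)).^(1/kappa) + loc;    % quantile function evaluated at u
F = 1 - exp(-((t-loc)/eta).^kappa);      % distribution function at t; should equal u
```

Replacing `u` with `rand(1,n)` gives n simulated lifetimes, which is what the listing does one draw at a time inside its loop.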




Bibliography 


1. Aho, M., L. J. Bain and M. Engelhardt. “Goodness-of-Fit Tests for the Weibull Distribution
with Unknown Parameters and Censored Sampling,” Journal of Statistical Computation and
Simulation, 18:59-69 (1983).

2. Aho, M., L. J. Bain and M. Engelhardt. “Goodness-of-Fit Tests for the Weibull Distribution
with Unknown Parameters and Heavy Censoring,” Journal of Statistical Computation and
Simulation, 21:213-225 (1985).

3. Akritas, Michael G. “Pearson-Type Goodness-of-Fit Tests: The Univariate Case,” Journal of
the American Statistical Association, 83(401):222-230 (March 1988).

4. Anderson, T. W. and D. A. Darling. “Theory of Certain ‘Goodness of Fit’ Criteria Based on
Stochastic Processes,” Annals of Mathematical Statistics, 23:193-212 (1952).

5. Archer, N. P. “A Computational Technique for Maximum Likelihood Estimation with Weibull
Models,” IEEE Transactions on Reliability, R-29:57-62 (1980).

6. Armitage, P. “The Comparison of Survival Curves,” Journal of the Royal Statistical Society
A, 122:279-300 (1959).

7. Arthur V. Peterson, Jr. “Expressing the Kaplan-Meier Estimator as a Function of Empirical
Subsurvival Functions,” Journal of the American Statistical Association, 72(368):854-858
(December 1977).

8. Bain, Lee J. and Charles E. Antle. “Estimation of Parameters in the Weibull Distribution,”
Technometrics, 9(4):621-627 (1967).

9. Berresford, Geoffrey C. Calculus, with Applications to the Management, Social, Behavioral
and Biomedical Sciences, Englewood Cliffs, NJ: Prentice-Hall, Inc., 1989.

10. Bickel, P. J. and M. Rosenblatt. “On Some Global Measures of Deviation of Density Function
Estimates,” Annals of Statistics, 1:1071-1095 (1973).

11. Blum, J. R. and V. Susarla. “Maximal Deviation Theory of Density and Failure Rate Function
Estimates Based on Censored Data.” Multivariate Analysis-V, edited by P. R. Krishnaiah. NY:
North-Holland Publishing Company, 1980.

12. Boyles, Russel A. “On the Convergence of the EM Algorithm,” Journal of the Royal Statistical
Society B, 45:47-50 (1983).

13. Breslow, N. and J. Crowley. “A Large Sample Study of the Life Table and Product Limit
Estimates Under Random Censorship,” Annals of Statistics, 2(3):437-453 (1974).

14. Breslow, N. E. “Analysis of Survival Data Under the Proportional Hazards Model,”
International Statistical Review, 43(1):45-58 (1975).

15. Burke, M. D. “Tests for Exponentiality Based on Randomly Censored Data,” Nonparametric
Statistical Inference, 1:89-102 (1980).

16. Burke, M. D., et al. On Random Censorship: A Collection of Fifteen Papers. Ottawa, Canada:
Carleton University, Department of Mathematics and Statistics, 1981.

17. Burke, M. D. and Lajos Horvath. “Density and Failure Rate Estimation in a Competing
Risks Model,” Sankhya: The Indian Journal of Statistics, (Series A, part 1):135-154 (1984).




18. Burke, Murray D., Sandor Csorgo and Lajos Horvath. “A Correction and Improvement of
‘Strong Approximations of Some Biometric Estimates under Random Censorship’,” Probability
Theory  and  Related  Fields,  79:51-57  (1988). 

19. Bush, John G., Brian W. Woodruff, Albert H. Moore and Edward J. Dunne. “Modified
Cramer-von Mises and Anderson-Darling Tests for Weibull Distributions with Unknown
Location and Scale Parameters,” Communications in Statistics - Theory and Methods,
12(21):2465-2476 (1983).

20.  Chen,  Chen-Hsin.  Correlation-Type  Goodness-of-Fit  Tests  for  Randomly  Censored  Data.  PhD 
dissertation,  Stanford  University,  1982. 

21.  Chen,  Chen-Hsin.  “A  Correlation-Type  Goodness-of-Fit  Test  for  Randomly  Censored  Data,” 
Biometrika, 71(2):315-322 (1984).

22.  Chen,  J.  Goodness  of  Fit  Tests  Under  Random  Censorship.  PhD  dissertation,  Oregon  State 
University,  1975. 

23.  Chen,  Y.  Y.,  M.  Hollander  and  N.  A.  Langberg.  “Small-Sample  Results  for  the  Kaplan-Meier 
Estimator,” Journal of the American Statistical Association, 77(377):141-144 (March 1982).

24.  Cheng,  Kuang-Fu.  “On  Almost  Sure  Representations  for  Quantiles  of  the  Product  Limit 
Estimator with Applications,” Sankhya: The Indian Journal of Statistics, 46(Series A, part
3):426-443 (1984).

25.  Cheng,  Phillip  E.  and  Gwo  Dong  Lin.  “Maximum  Likelihood  Estimation  of  a  Survival  Function 
Under the Koziol-Green Proportional Hazards Model,” Statistics & Probability Letters, 5:75-80
(January 1987).

26. Cohen, A. Clifford. “Maximum Likelihood Estimation in the Weibull Distribution Based on
Complete and on Censored Samples,” Technometrics, 7:579-588 (1965).

27.  Cox,  D.  R.  “The  Analysis  of  Exponentially  Distributed  Life-Times  with  Two  Types  of  Failure,” 
Journal of the Royal Statistical Society B, 21:411-421 (April 1959).

28.  Cox,  D.  R.  and  D.  Oakes.  Analysis  of  Survival  Data.  NY:  Chapman  and  Hall,  1984. 

29. Csorgo, Sandor. “Estimation in the Proportional Hazards Model of Random Censorship,”
Statistics, 19(3):437-463 (1988).

30. Csorgo, Sandor. “Universal Gaussian Approximations Under Random Censorship,” Annals of
Statistics, 24(6):2744-2778 (1996).

31. Csorgo, Sandor and Lajos Horvath. “On the Koziol-Green Model of Random Censorship,”
Biometrika, 68(2):391-401 (1981).

32.  D’Agostino,  Ralph  B.  and  Michael  A.  Stephens.  Goodness  of  Fit  Techniques.  NY:  Marcel 
Dekker,  1986. 

33.  David,  H.  A.  and  M.  L.  Moeschberger.  The  Theory  of  Competing  Risks.  New  York:  MacMillan 
Publishing  Co.,  Inc.,  1978. 

34. Dempster, A. P., N. M. Laird and D. B. Rubin. “Maximum Likelihood from Incomplete Data
via the EM Algorithm (with discussion),” Journal of the Royal Statistical Society B, 39(1):1-38
(1977).

35.  Doob,  J.  L.  “The  Brownian  Movement  and  Stochastic  Equations,”  Annals  of  Mathematics, 
43(2):351-369 (1942).




36. Doob, J. L. “Heuristic Approach to the Kolmogorov-Smirnov Theorems,” Annals of Mathematical
Statistics, 20:393-402 (1949).

37. Drenick, R. F. “The Failure Law of Complex Equipment,” Journal of the Society for Industrial
and Applied Mathematics, 8(4):680-690 (December 1960).

38. Duchesne, Thierry, Jacques Rioux and Andrew Luong. “Minimum Cramer-von Mises Distance
Methods for Complete and Grouped Data,” Communications in Statistics - Theory and
Methods, 26(2):401-420 (1997).

39.  Durbin,  J.  “Weak  Convergence  of  the  Sample  Distribution  Function  When  Parameters  are 
Estimated,” Annals of Statistics, 1(2):279-290 (1973).

40.  Durbin,  J.  and  M.  Knott.  “Components  of  Cramer-von  Mises  Statistics,  I,”  Journal  of  the 
Royal Statistical Society B, 34:290-307 (1972).

41.  Durbin,  J.,  M.  Knott  and  C.C.  Taylor.  “Components  of  Cramer-von  Mises  Statistics,  II,” 
Journal of the Royal Statistical Society B, 37(2):216-237 (1975).

42. Ebrahimi, N. and M. Habibullah. “Testing to Determine the Underlying Distribution Using
Incomplete Observations When the Life Time Distribution is Proportionally Related to the
Censoring Time Distribution,” Journal of Statistical Computation and Simulation, 40:109-118
(1992).

43.  Efron,  Bradley.  “The  Two  Sample  Problem  with  Censored  Data.”  Proceedings  of  the  Fifth 
Berkeley Symposium on Mathematical Statistics and Probability, 4. 831-853. 1967.

44.  Elperin,  T.  and  I.  Gertsbakh.  “Estimation  in  a  Random  Censoring  Model  with  Incomplete 
Information,” IEEE Transactions on Reliability, 37(2):223-229 (June 1988).

45. Eubank, R. L. and V. N. LaRiccia. “Location and Scale Parameter Estimation from Randomly
Censored Data,” Communications in Statistics - Theory and Methods, 11(25):2869-2888
(1982).

46. Fleming, Thomas R., Judith R. O’Fallon, Peter C. O’Brien and David P. Harrington. “Modified
Kolmogorov-Smirnov Test Procedures with Applications to Arbitrarily Right-Censored Data,”
Biometrics, 36:607-625 (December 1980).

47.  Foldes,  Antonia  and  Lidia  Rejto.  “Strong  Uniform  Consistency  for  Nonparametric  Survival 
Curve  Estimators  from  Randomly  Censored  Data,”  Annals  of  Statistics,  5(1):122-129  (1980). 

48. Foldes, A., L. Rejto and B. B. Winter. “Strong Consistency Properties of Nonparametric
Estimators for Randomly Censored Data, I: The Product-Limit Estimator,” Periodica
Mathematica Hungarica, 11(3):233-250 (1980).

49. Foldes, A., L. Rejto and B. B. Winter. “Strong Consistency Properties of Nonparametric
Estimators for Randomly Censored Data, II: Estimation of Density and Failure Rate,” Periodica
Mathematica Hungarica, 12(1):15-29 (1981).

50. Fuchs, Ronald P. A Non-parametric Probability Density Estimator and Some Applications.
PhD dissertation, Air Force Institute of Technology, 1984.

51. Gail, M. H. and J. H. Ware. “Comparing Observed Life Table Data with a Known Survival
Curve in the Presence of Random Censorship,” Biometrics, 35:385-391 (1979).

52.  Gallagher,  Mark  A.  and  Albert  H.  Moore.  “Robust  Minimum-Distance  Estimation  Using  the 
3-Parameter Weibull Distribution,” IEEE Transactions on Reliability, 39(5):575-580 (1990).

53.  Gehan,  E.  A.  “A  Generalized  Wilcoxon  Test  for  Comparing  Arbitrarily  Singly  Censored 
Samples,” Biometrika, 52:203-223 (1965).




54.  Gilbert,  J.  P.  Random  Censorship,  PhD  dissertation,  University  of  Chicago,  1962. 

55.  Gillespie,  Mary  Jo  and  Lloyd  Fisher.  “Confidence  Bands  for  the  Kaplan-Meier  Survival  Curve 
Estimate,” Annals of Statistics, 7(4):920-924 (1979).

56.  Gray,  Robert  J.  and  Donald  A.  Pierce.  “Goodness-of-Fit  Tests  for  Censored  Survival  Data,” 
Annals of Statistics, 13:552-563 (1985).

57. Habib, M. G. and D. R. Thomas. “Chi-Square Goodness-of-Fit Tests for Randomly Censored
Data,” Annals of Statistics, 14(2):759-765 (1986).

58. Habib, Mohamed Gamal Hass. A Chi-Square Goodness-of-Fit Test for Censored Data, PhD
dissertation,  Oregon  State  University,  1981. 

59.  Hall,  W.  J.  and  Jon  A.  Wellner.  “Confidence  Bands  for  a  Survival  Curve  from  Censored 
Data,” Biometrika, 67(1):133-143 (1980).

60.  Harris,  T.  E.,  Paul  Meier  and  John  W.  Tukey.  “Timing  of  the  Distribution  of  Events  Between 
Observations,” Human Biology, 22:249-270 (1950).

61. Harter, H. L. The Method of Least Squares and Some Alternatives - Part V. Technical Report
AD-A029507,  Aerospace  Research  Laboratory,  Wright-Patterson  Air  Force  Base,  OH,  1975. 

62.  Harter,  H.  L.  and  A.  H.  Moore.  “Maximum  Likelihood  Estimation  of  the  Parameters  of 
Gamma  and  Weibull  Populations  from  Complete  and  Censored  Samples,”  Technometrics, 
7:639-643  (1965). 

63.  Harter,  H.  L.  and  A.  H.  Moore.  “Maximum  Likelihood  Estimation  of  the  Parameters  of 
Gamma  and  Weibull  Populations  from  Complete  and  Censored  Samples,”  Technometrics, 
7:639-643  (1984). 

64.  Hobbs,  Jon  R.,  Albert  H.  Moore  and  Robert  M.  Miller.  “Minimum-Distance  Estimation  of 
the  Parameters  of  the  3-Parameter  Weibull  Distribution,”  IEEE  Transactions  on  Reliability, 
34(5):495-496 (1985).

65. Hollander, Myles and Edsel A. Pena. “A Chi-Squared Goodness-of-Fit Test for Randomly Censored
Data,” Journal of the American Statistical Association, 87(418):458-463 (June 1992).

66.  Hollander,  Myles  and  Frank  Proschan.  “Testing  to  Determine  the  Underlying  Distribution 
Using Randomly Censored Data,” Biometrics, 35:393-401 (1979).

67.  Horvath,  Lajos  and  Richard  A.  Johnson.  “Tests  of  Fit  for  Composite  Hypothesis  with  Censored 
Data,”  Statistics  &  Decisions,  5:21-43  (1991). 

68. Hoyland, Arnljot and Marvin Rausand. System Reliability Theory: Models and Statistical
Methods. NY: John Wiley & Sons, Inc., 1994.

69. Hyde, John. “Testing Survival Under Right Censoring and Left Truncation,” Biometrika,
64(2):225-230 (1977).

70.  Kaplan,  E.  L.  and  Paul  Meier.  “Nonparametric  Estimation  from  Incomplete  Observations,” 
Journal of the American Statistical Association, 53:457-481 (June 1958).

71. Kapur, K. C. and L. R. Lamberson. Reliability in Engineering Design. NY: John Wiley &
Sons,  1977. 

72.  Karunamuni,  R.  J.  and  Song  Yang.  “Weak  and  Strong  Uniform  Consistency  Rates  of  Kernel 
Density Estimates for Randomly Censored Data,” Canadian Journal of Statistics, 19(4):349-359
(1991).




73. Keilegom, Ingrid Van and Noel Veraverbeke. “Uniform Strong Convergence Results for the
Conditional Kaplan-Meier Estimator and its Quantiles,” Communications in Statistics - Theory
and Methods, 25:2251-2265 (1996).

74. Kim, Jee Soo and Frank Proschan. “Piecewise Exponential Estimator of the Survivor
Function,” IEEE Transactions on Reliability, 40(2):134-139 (1991).

75.  Kim,  Joo  Han.  “Chi-Square  Goodness-of-Fit  Tests  for  Randomly  Censored  Data,”  Annals  of 
Statistics, 21(3):1621-1639 (1993).

76. Kitchen, John, Naftali A. Langberg and Frank Proschan. “A New Method for Estimating Life
Distributions  from  Incomplete  Data,”  Statistics  &  Decisions,  1:241-255  (1983). 

77. Klein, John P. and M. L. Moeschberger. “The Robustness of Several Estimators of the
Survivorship Function with Randomly Censored Data,” Communications in Statistics - Simulation,
18(3):1087-1112 (1989).

78. Klein, John P., Shih-Chang Lee and M. L. Moeschberger. “A Partially Parametric Estimator
of Survival in the Presence of Randomly Censored Data,” Biometrics, 46:795-811 (September
1990).

79.  Kouassi,  Djokouri  A.  and  Jagbir  Singh.  “A  Semiparametric  Approach  to  Hazard  Estimation 
with  Randomly  Censored  Observations,”  Journal  of  the  American  Statistical  Association, 
92(440):1351-1355 (December 1997).

80. Koziol, James. “Goodness-of-Fit Tests for Randomly Censored Data,” Biometrika, 67(3):693-696
(1980).

81.  Koziol,  James  A.  and  Sylvan  B.  Green.  “A  Cramer-von  Mises  Statistic  for  Randomly  Censored 
Data,” Biometrika, 63(3):465-474 (1976).

82.  Kumazawa,  Yoshiki.  “A  Note  on  an  Estimator  of  Life  Expectancy  with  Random  Censorship,” 
Biometrika, 74(3):655-658 (1987).

83.  Lagakos,  S.  W.  “General  Right  Censoring  and  its  Impact  on  the  Analysis  of  Survival  Data,” 
Biometrics, 35:139-156 (March 1979).

84.  Lagakos,  S.  W.  and  J.  S.  Williams.  “Models  for  Censored  Survival  Analysis:  A  Cone  Class  of 
Variable-Sum Models,” Biometrika, 65(1):181-189 (1978).

85. Law, Averill M. and W. David Kelton. Simulation Modeling and Analysis. NY: McGraw-Hill,
Inc.,  1991. 

86. Lawless, J. F. Statistical Models and Methods for Lifetime Data. New York: John Wiley &
Sons,  Inc.,  1982. 

87.  Leemis,  Lawrence  M.  Reliability:  Probabilistic  Models  and  Statistical  Methods.  Englewood 
Cliffs,  New  Jersey:  Prentice-Hall,  Inc.,  1995. 

88.  Lemon,  G.  H.  “Maximum  Likelihood  Estimation  for  the  Three  Parameter  Weibull  Distribution 
Based  on  Censored  Samples,”  Technometrics,  17:247-254  (1975). 

89.  Link,  William  A.  “A  Model  for  Informative  Censoring,”  Journal  of  the  American  Statistical 
Association, 84(407):749-752 (September 1989).

90. Lio, Y. L. and W. J. Padgett. “Some Convergence Results for Kernel-Type Quantile Estimators
Under Censoring,” Statistics & Probability Letters, 5:5-14 (January 1987).

91. Lio, Y. L. and W. J. Padgett. “Asymptotically Optimal Bandwidth for a Smooth Nonparametric
Quantile Estimator Under Censoring,” Nonparametric Statistics, 1:219-229 (1992).




92. Marron, J. S. and W. J. Padgett. “Asymptotically Optimal Bandwidth Selection for
Kernel Density Estimators from Randomly Right-Censored Samples,” Annals of Statistics,
15(4):1520-1535 (1987).

93. Mauro, David. “A Combinatoric Approach to the Kaplan-Meier Estimator,” Annals of
Statistics, 13(1):142-149 (1985).

94. McNichols, Diane T. and W. J. Padgett. “A Modified Kernel Density Estimator for Randomly
Censored  Data,”  South  African  Statistical  Journal,  18:13-27  (1984). 

95.  McNichols,  Diane  T.  and  W.  J.  Padgett.  “Nonparametric  Methods  for  Hazard  Rate  Estimation 
from Right-Censored Samples,” Journal of the Chinese Statistical Association, 23(special
issue):1-15 (1985).

96.  McNichols,  Diane  T.  and  W.  J.  Padgett.  “Mean  and  Variance  of  a  Kernel  Density  Estimator 
Under  the  Koziol-Green  Model  of  Random  Censorship,”  Sankhya:  The  Indian  Journal  of 
Statistics, (part 2, series A):150-168 (1986).

97. Meier, Paul. “Estimation of a Distribution Function from Incomplete Observations,”
Perspectives in Probability and Statistics - Papers in Honour of M. S. Bartlett, 67-87 (1976).

98. Meilijson, Isaac. “A Fast Improvement to the EM Algorithm on its Own Terms,” Journal of
the Royal Statistical Society B, 51(1):127-138 (1989).

99. Miller, Rupert. “What Price Kaplan-Meier?,” Biometrics, 39:1077-1081 (December 1983).

100.  Moeschberger,  M.  L.  “Life  Tests  Under  Competing  Causes  of  Failure,”  Technometrics, 
16(1):39-47 (1974).

101. Moeschberger, M. L. and John P. Klein. “A Comparison of Several Methods of Estimating
the Survival Function when there is Extreme Right Censoring,” Biometrics, 41:39-47 (March
1985).

102.  Muller,  Hans-Georg  and  Jane-Ling  Wang.  “Hazard  Rate  Estimation  under  Random  Censoring 
with Varying Kernels and Bandwidths,” Biometrics, 50:61-76 (March 1994).

103.  Nair,  V.  N.  “Plots  and  Tests  for  Goodness  of  Fit  with  Randomly  Censored  Data,”  Biometrika, 
68(1):99-103 (1981).

104.  O’Neill,  Terrence  J.  “A  Goodness-of-Fit  Test  for  One-Sample  Life  Table  Data,”  Journal  of 
the American Statistical Association, 79(385):194-199 (March 1984).

105. Aranda-Ordaz, Francisco J. “Relative Efficiency of the Kaplan-Meier Estimator Under
Contamination,” Communications in Statistics - Simulation, 16(4):987-997 (1987).

106. Orchard, T. and M. A. Woodbury. “A Missing Information Principle: Theory and
Applications.” Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability,
697-715.  1972. 

107. Padgett, W. J. “A Kernel-Type Estimator of a Quantile Function from Right-Censored Data,”
Journal of the American Statistical Association, 81(393):215-222 (March 1986).

108.  Padgett,  W.  J.  “Nonparametric  Estimation  of  Density  and  Hazard  Rate  Functions  when 
Samples  are  Censored,”  Handbook  of  Statistics,  7:313-331  (1988). 

109.  Padgett,  W.  J.  and  Diane  T.  McNichols.  “Nonparametric  Density  Estimation  from  Censored 
Data,” Communications in Statistics - Theory and Methods, 13:1581-1611 (1984).

110. Palm, C. “Intensitaetsschwankungen im Fernsprechverkehr,” Ericsson Technics, 44:3-189
(1943). 




111.  Parr,  W.  C.  and  W.  R.  Schucany.  “Minimum  Distance  and  Robust  Estimation,”  Journal  of 
the  American  Statistical  Association,  75:616-624  (1980). 

112. Patil, P. On Kernel Density Estimation under the Koziol-Green Model. Technical Report,
Australian National University, 1991.

113.  Quenouille,  M.  H.  “Approximate  Tests  of  Correlation  in  Time  Series,”  Journal  of  the  Royal 
Statistical Society B, 11:68-84 (1949).

114. Quenouille, M. H. “Notes on Bias in Estimation,” Biometrika, 43:353-360 (1956).

115.  Rockette,  Howard,  Charles  Antle  and  Lawrence  Klimko.  “Maximum  Likelihood  Estimation 
with the Weibull Model,” Journal of the American Statistical Association, 69:246-249 (1974).

116. Rosenblatt, M. “Curve Estimates,” Annals of Mathematical Statistics, 42:1815-1842 (1971).

117. Rufflin, Scott J. Optimum Preventive Maintenance Policies for the AMRAAM Missile. MS
thesis, Air Force Institute of Technology, 1998.

118.  Schmidt,  Peter  and  Ann  Dryden  Witte.  Predicting  Recidivism  Using  Survival  Models.  NY: 
Springer-Verlag, 1988.

119. Schmidt, Peter and Ann Dryden Witte. “Predicting Criminal Recidivism Using ‘Split
Population’ Survival Time Models,” Journal of Econometrics, 40:141-159 (1989).

120. Silverman, B. W. Density Estimation for Statistics and Data Analysis (Monographs on
Statistics and Applied Probability). NY: Chapman and Hall Ltd., 1986.

121. Srinivasan, R. “An Approach to Testing the Goodness of Fit of Incompletely Specified
Distributions,” Biometrika, 57(3):605-611 (1970).

122. Stephens, M. A. “Use of the Kolmogorov-Smirnov, Cramer-von Mises and Related Statistics
Without Extensive Tables,” Journal of the Royal Statistical Society B, 32:115-122 (1970).

123.  Stephens,  M.  A.  “EDF  Statistics  for  Goodness-of-Fit  and  Some  Comparisons,”  Journal  of  the 
American Statistical Association, 69:730-737 (1974).

124.  Stute,  W.  and  J.  L.  Wang.  “The  Strong  Law  under  Random  Censorship,”  Annals  of  Statistics, 
21(3):1592-1607 (1993).

125. Stute, Winfried. “The Central Limit Theorem under Random Censorship,” Annals of
Statistics, 23(2):422-439 (1995).

126. Stute, Winfried and Jane-Ling Wang. “The Jackknife Estimate of a Kaplan-Meier Integral,”
Biometrika,  (3):602-606  (1994). 

127.  Sun,  Feng-Bin  and  Dimitri  B.  Kececioglu.  “A  New  Method  for  Obtaining  the  TTT  Plot  for 
a Censored Sample.” Proceedings of the Annual Reliability and Maintainability Symposium.
112-116.  1999. 

128.  Sundberg,  Rolf.  “Maximum  Likelihood  Theory  for  Incomplete  Data  from  an  Exponential 
Family,” Scandinavian Journal of Statistics, 1:49-58 (1976).

129. Susarla, V. and J. Van Ryzin. “Nonparametric Bayesian Estimation of Survival Curves from
Incomplete Observations,” Journal of the American Statistical Association, 71(356):897-902
(December  1976). 

130.  Susarla,  V.  and  J.  Van  Ryzin.  “A  Large  Sample  Theory  for  a  Nonparametric  Survival  Curve 
Estimator Based on Censored Samples,” Annals of Statistics, 6(4):755-768 (1978).




131. Sweeder, James. Nonparametric Estimation of Distribution and Density Functions with
Applications, PhD dissertation, Air Force Institute of Technology, 1982.

132.  Tanner,  Martin  A.  “A  Note  on  the  Variable  Kernel  Estimator  of  the  Hazard  Function  from 
Randomly Censored Data,” Annals of Statistics, 11(3):994-998 (1983).

133.  Tanner,  Martin  A.  and  Wing  Hung  Wong.  “The  Estimation  of  the  Hazard  Function  from 
Randomly Censored Data by the Kernel Method,” Annals of Statistics, 11(3):989-993 (1983).

134. Thomas, David R. and Gary L. Grunkemeier. “Confidence Interval Estimation of Survival
Probabilities for Censored Data,” Journal of the American Statistical Association,
70(352):865-871 (1975).

135.  Turnbull,  B.  W.  “The  Empirical  Distribution  Function  with  Arbitrarily  Grouped,  Censored 
and Truncated Data,” Journal of the Royal Statistical Society B, 38:290-295 (1976).

136.  Turnbull,  B.  W.  and  L.  Weiss.  “A  Likelihood  Ratio  Statistic  for  Testing  Goodness-of-Fit  with 
Randomly Censored Data,” Biometrics, 34:367-375 (1978).

137. van der Vaart, Aad. “Maximum Likelihood Estimation with Partially Censored Data,” Annals
of Statistics, 22(4):1896-1916 (1994).

138.  Veterans  Administration  Cooperative  Urological  Research  Group.  “Treatment  and  Survival  of 
Patients with Cancer of the Prostate,” Surgery, Gynecology, and Obstetrics, 124:1011-1017
(1967). 

139.  Wang,  Jia-Gang.  “A  Note  on  the  Uniform  Consistency  of  the  Kaplan-Meier  Estimator,” 
Annals of Statistics, 15(3):1313-1316 (1987).

140. Wang, Jin and Jaiding Chen. “On the Strong Consistency of the Maximum Likelihood
Estimators from Randomly Censored Samples,” International Journal of Reliability, Quality and
Safety Engineering, 4:35-53 (1997).

141.  Wei,  L.  J.  “Testing  Goodness  of  Fit  for  Proportional  Hazards  with  Censored  Observations,” 
Journal of the American Statistical Association, 79(387):649-652 (September 1984).

142. Wellner, Jon A. “A Heavy Censoring Limit Theorem for the Product Limit Estimator,”
Annals of Statistics, 13(1):150-162 (1985).

143.  Westberg,  Ulf  and  Bengt  Klefsjo.  “TTT-Plotting  for  Censored  Data  Based  on  the  Piecewise 
Exponential  Estimator,”  International  Journal  of  Reliability,  Quality  and  Safety  Engineering, 
1(1):1-13 (1994).

144.  Whittemore,  A.  S.  and  J.  B.  Keller.  Survival  Estimation  with  Censored  Data.  Technical 
Report  69,  Stanford  University,  1983. 

145.  Williams,  J.  S.  and  S.  W.  Lagakos.  “Independent  and  Dependent  Censoring  Mechanisms.” 
Proceedings of the 9th International Biometric Conference, 1. 408-427. 1976.

146. Williams, J. S. and S. W. Lagakos. “Models for Censored Survival Analysis: Constant-Sum
and Variable-Sum Models,” Biometrika, 64(2):215-224 (1977).

147. Wingo, Dallas R. “Maximum Likelihood Estimation of the Parameters of the Weibull
Distribution by Modified Quasilinearization,” IEEE Transactions on Reliability, R-21:89-93 (1972).

148. Wolfowitz, J. “Estimation by the Minimum Distance Method,” Annals of the Institute of
Statistical  Mathematics,  5:9-23  (1953). 

149. Wolfowitz, J. “The Minimum Distance Method,” Annals of Mathematical Statistics, 28:75-88
(1957). 




150.  Wu,  C.  F.  J.  “On  the  Convergence  Properties  of  the  EM  Algorithm,”  Annals  of  Statistics, 
11(1):95-103 (1983).

151.  Yandell,  Brian  S.  “Nonparametric  Inference  for  Rates  with  Censored  Survival  Data,”  Annals 
of Statistics, 11(4):1119-1135 (1983).

152. Yang, G. “Life Expectancy Under Random Censorship,” Stochastic Processes and
Applications, 6:33-39 (1977).

153. Zhou, Mai. “Some Properties of the Kaplan-Meier Estimator for Independent Nonidentically
Distributed Random Variables,” Annals of Statistics, 19:2266-2274 (1991).




Vita 


Mr. David M. Reineke was born in Troy, Ohio, on October 14, 1966. He graduated from
Tippecanoe High School in Tipp City, Ohio, in 1985. Mr. Reineke graduated from Wright State
University with a B.S. in Secondary Mathematics Education in 1991 and an M.S. in Applied
Statistics in 1994. He worked as a graduate teaching assistant in the Department of Mathematics
and Statistics at Wright State University during the 1993-1994 school year and has worked as a
full-time instructor in the department from 1994 to the present day. Mr. Reineke entered the AFIT
family under a DAGSI scholarship in the Fall of 1996 to pursue a Ph.D. in Applied Statistics.


Permanent address: 8156 Mt. Charles, Huber Heights, OH 45424




REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB No. 0704-0188




1. AGENCY USE ONLY (Leave blank)

2. REPORT DATE
June 1999

3. REPORT TYPE AND DATES COVERED
Doctoral Dissertation

4. TITLE AND SUBTITLE
Estimation and Goodness-of-Fit in the Case of Randomly Censored Lifetime Data

5. FUNDING NUMBERS


6.  AUTHOR(S) 

Mr.  David  M.  Reineke 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 
Air  Force  Institute  of  Technology 
2950  P  Street 

Wright-Patterson  AFB,  OH,  45433-7765 


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 

AFIT/DS/ENC/99-01 


9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 
Lt  Col  Ken  Bruner 
HQ  AFOTEC/TSE 
8500  Gibson  Blvd  SE 
Kirtland  AFB,  NM  87117 


10.  SPONSORING/MONITORING 
AGENCY  REPORT  NUMBER 


11. SUPPLEMENTARY NOTES

Advisor: Maj John S. Crown, 1-937-255-3636 x4513, john.crown@afit.af.mil


12a.  DISTRIBUTION  AVAILABILITY  STATEMENT 
Approved  for  public  release;  Distribution  Unlimited 


12b.  DISTRIBUTION  CODE 


13. ABSTRACT (Maximum 200 Words)

A  new  continuous  distribution  function  estimator  for  randomly  censored  data  is  developed,  discussed,  and  compared  to 
existing  estimators.  Minimum  distance  estimation  is  shown  to  be  effective  in  estimating  Weibull  location  parameters  when 
random  censoring  is  present.  A  method  of  estimating  all  3  parameters  of  the  3-parameter  Weibull  distribution  using  a 
combination  of  minimum  distance  and  maximum  likelihood  is  also  given.  Cramer-von  Mises  and  Anderson-Darling 
goodness-of-fit  test  statistics  are  modified  to  measure  the  discrepancy  between  the  maximum  likelihood  estimate  and  the 
Kaplan-Meier  product-limit  estimate  of  the  distribution  function  of  the  random  variable  of  interest.  These  modified  test 
statistics  are  used  to  construct  goodness-of-fit  tests  for  the  exponential,  Weibull  (shape  2),  and  Weibull  (shape  3.5) 
distributions  when  the  censoring  distribution  is  assumed  to  be  exponential.  Percentage  points  are  obtained  via  Monte  Carlo 
simulation.  More  generally,  elements  of  competing  risks  theory  are  used  to  build  goodness-of-fit  tests  using  crude  lifetimes. 
For  tests  based  on  crude  lifetimes,  the  assumption  of  an  exponentially  distributed  censoring  variable  and  special  estimation 
techniques  are  no  longer  required.  Further,  complete  sample  goodness-of-fit  techniques  may  be  used,  bringing  much  more 
flexibility  to  goodness-of-fit  testing  when  samples  are  randomly  right-censored. 


14.  SUBJECT  TERMS 


Randomly  Censored  Data,  Goodness  of  Fit,  Competing  Risks,  Crude  Lifetimes,  Kaplan-Meier, 
Anderson-Darling,  Minimum  Distance  Estimation 


15.  NUMBER  OF  PAGES 

215 

16.  PRICE  CODE 


17. SECURITY CLASSIFICATION OF REPORT
UNCLASSIFIED

18. SECURITY CLASSIFICATION OF THIS PAGE
UNCLASSIFIED

19. SECURITY CLASSIFICATION OF ABSTRACT
UNCLASSIFIED

20. LIMITATION OF ABSTRACT


Standard  Form  298  (Rev.  2-89)  (EG) 

Prescribed  by  ANSI  Std.  239.18 

Designed  using  Perform  Pro,  WHS/DIOR,  Oct  94 


